When I design a cell layout I usually assign sample text, e.g. 'John Appleseed', to a 'name' label so I can easily see where the field is on the layout and check the composition. Otherwise there is an empty label on a white background. Obviously this text does not need to be translated, as it will always be replaced by another value at runtime.
Is there any property I can set in the Object Inspector to exclude this text from the .strings / XLIFF file? Translators usually charge per word, so I don't want to send those texts for translation.
For the time being I use a '~' prefix and then remove those texts using a Ruby script, but I was wondering whether there is an easier way to do it.
Unfortunately, if you are using ibtool (and you do not really have an alternative) you cannot exclude words directly.
What you can do is edit the XLIFF file after you export it and add the attribute translate="no" to the strings you want to exclude. You should make sure that your translators use an XLIFF-compatible tool to translate.
But, imho, this is not any better than your way.
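For what it's worth, the post-export edit can be scripted. The following is only a minimal sketch, assuming the export is XLIFF 1.2 and that placeholder strings carry the '~' prefix mentioned in the question; the function name, the sample document and its ids are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by XLIFF 1.2 documents (assumed export format).
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def mark_placeholders(xliff_text, prefix="~"):
    """Set translate="no" on every trans-unit whose source starts with prefix."""
    ET.register_namespace("", XLIFF_NS)  # keep the default namespace on output
    root = ET.fromstring(xliff_text)
    marked = 0
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        source = unit.find(f"{{{XLIFF_NS}}}source")
        if source is not None and (source.text or "").startswith(prefix):
            unit.set("translate", "no")
            marked += 1
    return ET.tostring(root, encoding="unicode"), marked


# Hypothetical sample; the ids and file name are invented for this sketch.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
  <file original="Main.storyboard" source-language="en" datatype="plaintext">
    <body>
      <trans-unit id="abc-12-xyz.text"><source>~John Appleseed</source></trans-unit>
      <trans-unit id="def-34-uvw.text"><source>Save</source></trans-unit>
    </body>
  </file>
</xliff>"""

if __name__ == "__main__":
    updated, count = mark_placeholders(SAMPLE)
    print(f"marked {count} unit(s)")
    print(updated)
```

Whether a translator is actually spared those words still depends on their tool honouring the translate attribute.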
Also see question 1, question 2 and ibtool's manual.
I am trying to read text in a given rectangle using the readText() function.
The function works correctly except when it has to read text containing special characters such as ', _ or &.
I tried using validCharacters with the readText() function, but it didn't help.
Code -
put ReadText((287,125,810,164),validCharacters:"_-'.ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01234567890") into Login
I tried working with character collections, but that doesn't seem right because the text I am trying to read is a dynamic combination of numbers, letters and special characters, so one cannot build a character collection covering every letter (a-z, A-Z), digit (0-9) and special character.
Example of the text I am trying to read:
Login_Userid1_1, Login'Userid1_1
So how do I read such text correctly?
Debugging OCR is a bit of an imprecise science. EggPlant has a lot of OCR parameters to tweak. When designing test cases it's best to try to use other mechanisms to gather information whenever possible. ReadText() should be considered a last resort when more reliable methods are unavailable. When I've used it I've often needed a lot of trial and error to find the right set of settings, and the right SearchRectangle, to get consistent results. Without seeing exactly what images you are trying to read text from, it's difficult if not impossible to troubleshoot where the issue might be.
One thing that does stand out to me is that you're trying to read strings that may contain underscores. ReadText() has an optional property IgnoreUnderscores which treats underscores as spaces. By default this property is set to ON. It defaults to ON because some OCR engines have problems identifying underscore characters consistently.
If you want to have ReadText() handle underscores you'll want to explicitly set this property to OFF.
ReadText(rect, validCharacters:chars, ignoreUnderscores:OFF)
I know that PO / MO files are meant to be used for small strings like button names, labels, etc. Not long text like an About page, etc.
But lately I am encountering a lot of situations that are in the middle. For example, a two sentence call to action. Or a short paragraph.
Is there a best practice or "rule of thumb" for when a string is too long to put in a PO file?
update
For "long" text I use partials and include the correct language version. My question is WHEN is it optimal to use one vs the other. I've heard that PO files are "inefficient" for "long" pieces of text. But what does that mean and when is it too "long"? Or is this not a concern?
Use one entry for a self-contained chunk of text; e.g. a sentence as you say.
Two sentences that belong together and don't make sense without each other should be one entry. Why? Because otherwise the translator wouldn't have the context necessary to translate it well. Same goes for a short paragraph, e.g. explaining a setting: if it's inseparable in the code, it should be one entry.
If you encounter a situation where you have lots of long texts regularly (e.g. entire pages or paragraphs of pages), that's usually a sign that you are using an ill-fitting tool. Some people do it, using Gettext for entire articles, but you're better off having separate documents in such cases. But that doesn't seem to be the case here.
I'm using the same zachrone iphonepdf but I did not get the text. My text view shows nothing. What's the problem?
Here is my code:
NSString *text=convertPDF(@"Course.pdf");
texview.text=text;
But I did not get anything in text view?
The text extractor zachron / pdfiphone (I assume you meant that one) is extremely naive and makes very many assumptions.
It ignores the PDF file structure and therefore completely ignores whether the data it inspects is still used in the current revision.
It ignores encryption and therefore will fail completely for many documents with usage restrictions.
It completely ignores font encodings and implicitly assumes an ASCII-like one. This is fairly often true in small PDFs with English text only and without embedded fonts; otherwise the result can be anything.
... many many more assumptions ...
Unless one only has to deal with very simple documents and the extracted text is not really necessary for the functionality of one's code, I would propose using different code for text extraction.
I am looking at gettext and .po files for creating a multilingual application. My understanding is that in the .po file msgid is the source and msgstr is the translation. Accordingly I see 2 ways of defining msgid:
Using full text (e.g. "My name is %s.\n") with the following advantages:
when calling gettext you can clearly see what is about to be translated
it's easier to translate .po files because they contain the actual content to be translated
Using a key (e.g. my-name %s) with the following advantages:
when the source text is long (e.g. paragraph about company), gettext calls are more concise which makes your views cleaner
easier to maintain several .po files and views, because the key is less likely to change (e.g. key of company-description far less likely to change than the actual company description)
Hence my question:
Is there a way of working with gettext and .po files that allows combining the advantages of both methods, that is:
-usage of keys for gettext calls
-ability for the translator to see the full text that needs to be translated?
gettext was designed to translate English text to other languages, and this is the way you should use it. Do not use it with keys. If you want keys, use some other technique such as an associative array.
I have managed two large open-source projects (50 languages, 5000 translations), one using the key approach and one using the gettext approach - and I would never use the key approach again.
The cons include propagating changes in English text to the other languages. If you change
msg_no_food = "We had no food left, so we had to eat the cats"
to
msg_no_food = "We had no food left, so we had to eat the cat's"
The new text has a completely different meaning - so how do you ensure that other translations are invalidated and updated?
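To make the difference concrete, here is a toy simulation (plain dictionaries, not the real gettext API; the German string and the function names are made up for illustration). With a key lookup the outdated translation keeps being served silently, whereas with a text lookup the changed English string simply no longer matches, so the gap is visible. In real Gettext workflows, msgmerge typically flags such changed entries as fuzzy.

```python
# Toy catalogs standing in for compiled translations (not the real gettext API).
GERMAN_BY_TEXT = {
    "We had no food left, so we had to eat the cats":
        "Wir hatten kein Essen mehr, also mussten wir die Katzen essen",
}
GERMAN_BY_KEY = {
    "msg_no_food":
        "Wir hatten kein Essen mehr, also mussten wir die Katzen essen",
}


def translate_by_text(catalog, english):
    """gettext-style lookup: the English text is the key; fall back to it on a miss."""
    return catalog.get(english, english)


def translate_by_key(catalog, key, english):
    """Key-style lookup: the key stays stable even when the English text changes."""
    return catalog.get(key, english)


# The English source text after the edit described above.
new = "We had no food left, so we had to eat the cat's"

# Key approach: the old translation is still returned, nothing signals it is stale.
stale = translate_by_key(GERMAN_BY_KEY, "msg_no_food", new)

# Text approach: the lookup misses, so the untranslated English shows up.
visible = translate_by_text(GERMAN_BY_TEXT, new)

if __name__ == "__main__":
    print("key lookup :", stale)
    print("text lookup:", visible)
```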
You mentioned having long text that makes your scripts hard to read. The solution to this might be to put these in a separate script. For example, put this in the main code
print help_message('help_no_food')
and have a script that just provides help messages:
switch ($help_msg) {
...
case 'help_no_food': return gettext("We had no food left, so we had to eat the cat's");
...
}
Another problem for gettext is when you have a full page to translate. Perhaps a brochure page on a website that contains lots of embedded images. If you allow lots of space for languages with long text (e.g. German), you will have lots of whitespace on languages with short text (e.g. Chinese). As a result, you might have different images/layout for each language.
Since these tend to be few in number, it is often easier to implement these outside gettext completely. e.g.
brochure-view.en.php
brochure-view.de.php
brochure-view.zh.php
I just answered a similar (much older) question here.
Short version:
The PO file format is very simple, so it is possible to generate PO/MO files from another workflow that allows the flexibility you're asking for. (your devs want identifiers, your translators want words)
You could roll this solution yourself, or use a cloud-based app like Loco to manage your translations and export a Gettext file with identifiers when your devs need them.
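As a rough idea of what rolling it yourself could look like, the sketch below writes a PO file whose msgid is the developer's identifier while the full English text travels along as an extracted comment ("#.") for the translator. The data layout and function names are assumptions for illustration, not part of any particular tool.

```python
def po_quote(text):
    """Escape a string and wrap it in the double quotes a PO file expects."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))
    return f'"{escaped}"'


def build_po(entries, language):
    """Build PO text; entries maps identifier -> (english_text, translation)."""
    lines = [
        'msgid ""',                      # the empty msgid carries the header
        'msgstr ""',
        po_quote(f"Language: {language}\n"),
        po_quote("Content-Type: text/plain; charset=UTF-8\n"),
        "",
    ]
    for key, (english, translation) in sorted(entries.items()):
        for comment_line in english.splitlines():
            lines.append(f"#. {comment_line}")   # full text shown to the translator
        lines.append(f"msgid {po_quote(key)}")   # identifier used in the code
        lines.append(f"msgstr {po_quote(translation)}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical entry based on the example in the question.
    sample = {"my-name": ("My name is %s.\n", "Mein Name ist %s.\n")}
    print(build_po(sample, "de"))
```

The resulting file can then be compiled to MO with the usual tools; the English text itself would need its own catalog (or a fallback) since it is no longer the msgid.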
As far as I know there are many ways:
Directly in the code: this could work only if the application doesn't need to be internationalized, but it's not the best, I think.
In the localization files: I've run into the problem that when I internationalize a model, and I have buttons like Create %{model}, if the model has more than one word, it may look awkward if only the first letter is capitalized.
In the code using humanize or titleize: It may lead to capitalization of sentences like Create And Continue, capitalizing the And when you could want to say something like Create and Continue or Create and continue.
Through CSS: I thought this was the best place because capitalization is part of the style of the page (or not?) and it's similar to using humanize or titleize, but you still have the same problems as with those.
I've tried them and I've had difficulties with all of them, especially because there are acronyms that shouldn't be transformed to lowercase and articles that look a little ugly when capitalized.
Also, sometimes you want to use the same words but capitalize them differently. In that case, would it be better to use two different entries in the locale files, or to use option 3 or 4 to change them?
When using the 4th option I found it difficult to write tests because the HTML has everything lowercased even though it is not displayed that way. Cucumber doesn't parse CSS to check the style of words.
What is wrong with putting it in the localization? Put each text as you want it to appear on the site, and you're set. If you write the text yourself, there should be no need to programmatically mangle it afterwards.
As for the models: Put human readable names into the translations for each of them. Also, if you think they need to be capitalized differently in some place - capitalize just the model name, not the entire button text.
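A language-neutral sketch of that idea, written in Python rather than Rails and using a made-up locale dictionary and helper names: the model names live in the translations exactly as they should read, and only the model name is optionally capitalized before it is interpolated into the button text.

```python
# Hypothetical locale data; in Rails this would live in the locale files.
LOCALE_EN = {
    "models": {"purchase_order": "purchase order", "api_key": "API key"},
    "buttons": {"create": "Create %{model}"},
}


def capitalize_first(text):
    """Upper-case only the first character, leaving acronyms such as 'API' intact."""
    return text[:1].upper() + text[1:]


def button_label(locale, button, model, capitalize_model=False):
    """Interpolate the human-readable model name into the button template."""
    name = locale["models"][model]
    if capitalize_model:
        name = capitalize_first(name)  # touches the model name only
    return locale["buttons"][button].replace("%{model}", name)


if __name__ == "__main__":
    print(button_label(LOCALE_EN, "create", "api_key"))
    print(button_label(LOCALE_EN, "create", "purchase_order", capitalize_model=True))
```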