Meta Tag Keywords - keyword

If I have the keywords "french, toast", will "french toast" automatically be a keyword applied to my site? Or does it need to be stated explicitly, as in "french, toast, french toast"?

According to the HTML5 spec, the value "must be a set of comma-separated tokens, each of which is a keyword relevant to the page".
In the example you can see that a token can include a space (in "type face"):
<meta name="keywords" content="british,type face,font,fonts,highway,highways">
So for this example the algorithm would add the following items to the keywords list:
british
type face
font
fonts
highway
highways
Strictly speaking, these are not keywords:
font highway
british type face
face
…
So yes, you'd have to explicitly state combinations of keywords (if they make sense). "Stack" and "Overflow" are not equivalent to "Stack Overflow".
However, depending on the implementation of the consuming user-agent, this might not be needed, as some user-agents will probably throw all keywords into a basket anyway.
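The splitting rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the spec's exact algorithm: the function name parse_keywords is made up for this example, and it assumes the content value has already been extracted from the meta element.

```python
def parse_keywords(content):
    """Split a keywords string on commas; spaces inside a token are kept."""
    keywords = []
    for token in content.split(","):
        token = token.strip()  # drop whitespace around each token
        # Skip empty tokens and tokens that were already seen.
        if token and token not in keywords:
            keywords.append(token)
    return keywords


print(parse_keywords("british,type face,font,fonts,highway,highways"))
print(parse_keywords("french, toast"))
```

With "french, toast" as input, the result contains the two separate keywords "french" and "toast"; "french toast" only shows up if it is listed as its own token.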

Related

Can I declare app-wide acronym substitution for voice-over?

My app is data-centric and contains lots of German legal acronyms for German laws. An English example would be "Cx" being read as "Constitution". I have a handful of those acronyms, and they appear in various text fields all across my app.
Is there a way to declare app-wide voice-over pronunciation rules for these acronyms as a developer? I would rather not implement manual text substitution for 100+ UITextViews...
Research so far:
This document, talking about choosing systemwide acronym substitution, but as an end user. (under "Changing pronunciations in the Speech Dictionary")
UIAccessibilityReadingContent seems to be talking about a general Design-Pattern, not about word level optimisation of the output.
I’m unaware of an app-wide pronunciation table for VoiceOver. An easy workaround is to use a custom subclass of UITextView and override accessibilityLabel to return the text with each language’s abbreviations replaced by the phrase that should be spoken.
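The substitution step itself does not depend on UIKit, so here is a rough sketch of it in Python rather than Swift. The table ACRONYMS and the function spoken_text are hypothetical names for this illustration; in the app, the same logic would sit inside the overridden accessibilityLabel. It assumes the acronyms start and end with a letter or digit, so that word-boundary matching works.

```python
import re

# Hypothetical acronym table; "Cx" is the example from the question.
ACRONYMS = {"Cx": "Constitution"}


def spoken_text(text, acronyms=ACRONYMS):
    """Replace whole-word acronyms with the phrase that should be read aloud."""
    if not acronyms:
        return text
    # Try longer acronyms first so that a short one cannot shadow a longer one.
    ordered = sorted(acronyms, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(a) for a in ordered) + r")\b")
    return pattern.sub(lambda m: acronyms[m.group(1)], text)


print(spoken_text("Art. 5 Cx applies"))
```

Keeping the table in one place means the 100+ text views share a single set of rules instead of each doing its own replacement.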

Is it safe to localize the "#" (pound sign, hashtag) as an ordinal indicator? (e.g. "#1", "#2")

We have a list of numbered reports that we display in our app - the numbering is important since the number of the report indicates the order it was created in, and gives the user a unique identifier to discuss the report with other users.
In a coworker's pull request, they're displaying text in labels in the UI for the cells which represent the reports with something like this:
label.text = "#\(report.userFacingNumber)"
This has the effect of numbering the reports correctly in English (i.e. "#1", "#2", etc.), but I'm not sure that this will make sense in other languages.
Is the pound sign universal for indicating the ordinal of a list item? If not, is there an example language where this doesn't localize correctly that I could use to prove my point?
Is there a way in Swift/Foundation to correctly localize this ordinal listing?
"Is the pound sign universal for indicating the ordinal of a list item?"
No, it is not. Many languages use N° or other symbols.
See:
https://en.wikipedia.org/wiki/Numero_sign
https://en.wikipedia.org/wiki/Number_sign#Usage_in_North_America
"Is there a way in Swift/Foundation to correctly localize this ordinal listing?"
Maybe there is a NumberFormatter that does this correctly, but I am not sure.
Your best bet is likely to omit the symbol.

What products support 3-digit region subtags, e.g., es-419 for Latin-American Spanish?

Are web browsers, translation tools and translators familiar with these numeric codes in addition to the more common "es" or "es-ES"?
I've already visited the following pages:
W3C Choosing a Language Tag
W3C Language tags in HTML and XML
RFC 5646 Tags for Identifying Languages
Microsoft National Language Support (NLS) API Reference
I doubt that many products like that exist. It seems that some mainstream programming languages (I have tested C# and Java) do not support these tags, so it would be quite hard to develop programs that do.
BTW, the NLS API Reference that you linked does not contain a region tag for any of the LCID definitions. And if you think about it for a moment, knowing how a Locale Identifier is built, there is actually no way to support it now. An implementation change would be required (they would have to use some reserved bits, I suppose).
I don't think we will see support for region tags in the foreseeable future.
Edit
I saw that Microsoft assigned LCID of value -1 and -2 to "European Union 1" and "European Union 2" respectively. However I don't think it is related.

If you have an application localized in pt-br and pt-pt, what language you should choose if the system is reporting only "pt" code?

If you have an application localized in pt-br and pt-pt, which language should you choose if the system is reporting only the pt code (generic Portuguese)?
This question is independent of the nature of the application, desktop, mobile or browser based. Let's assume you are not able to get region information from another source and you have to choose one language as the default one.
The question applies equally to other cases, including:
pt-pt and pt-br
en-us and en-gb
fr-fr and fr-CA
zh-cn, zh-tw, .... - in fact in this case I know that zh can be used as predominant language for Simplified Chinese where full code is zh-hans. For Traditional Chinese, with codes like zh-tw, zh-hant-tw, zh-hk, zh-mo the proper code (canonical) should be zh-hant.
Q1: How do I determine the predominant languages for a specified meta-language?
I need a solution that will include at least Portuguese, English and French.
Q2: If the system reported Simplified Chinese (PRC) (zh-cn) as preferred language of the user and I have translation only for English and Traditional Chinese (en,zh-tw) what should I choose from the two options: en or zh-tw?
In general you should separate the "guess the missing parameters" problem from the "matching a list of locales I want vs. a list of locales I have" problem. They are different.
Guessing the missing parts
These are all tricky areas, and even (potentially) politically charged.
But with very few exceptions the rule is to select the "original country" of the language.
The exceptions are mostly based on population.
So fr-FR for fr, es-ES, etc.
Some exceptions: pt-BR instead of pt-PT, en-US instead of en-GB.
It is also commonly accepted (and required by the Chinese standards) that zh maps to zh-CN.
You might also have to look at the country to determine the script, or the other way around.
For instance az => az-AZ but az-Arab => az-Arab-IR, and az_IR => az_Arab_IR
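As a rough sketch, the guessing step boils down to a lookup table. The table below contains only the examples given in this answer and is not complete; real data comes from the CLDR likely-subtags table. The names LIKELY and guess_full_tag are made up for this illustration.

```python
# Illustrative table built only from the examples in the answer above.
# A real implementation would load the CLDR likely-subtags data instead.
LIKELY = {
    "fr": "fr-FR",
    "es": "es-ES",
    "pt": "pt-BR",
    "en": "en-US",
    "zh": "zh-CN",
    "az": "az-AZ",
    "az-Arab": "az-Arab-IR",
    "az-IR": "az-Arab-IR",
}

# Case-insensitive lookup keys, since tags arrive in mixed case.
_LOOKUP = {key.lower(): value for key, value in LIKELY.items()}


def guess_full_tag(tag):
    """Fill in missing subtags; return the tag unchanged if nothing is known."""
    normalized = tag.replace("_", "-")
    return _LOOKUP.get(normalized.lower(), normalized)


print(guess_full_tag("pt"))
print(guess_full_tag("az_IR"))
```

Tags that are not in the table are returned as they came in (with underscores turned into hyphens), so the caller can still try to match them.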
Matching 'want' vs. 'have'
This involves matching a list of want vs. a list of have languages.
Dealing with lists makes it harder. The result should also be sorted in a smart way, if possible. For instance, if want = [ fr ro ] and have = [ en fr_CA fr_FR ro_RO ], then you probably want [ fr_FR fr_CA ro_RO ] as the result.
There should be no match between languages with different scripts. So zh-TW should not fall back to zh-CN, and mn-Mong should not fall back to mn-Cyrl.
Tricky areas: sr-Cyrl should not fall back to sr-Latn in theory, but it might be understood by users. ro-Cyrl might fall back to ro-Latn, but not the other way around.
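A minimal sketch of such a matcher, under stated assumptions: the default-region and implied-script tables are tiny hand-written samples (a real implementation would take them from CLDR), all function and table names are invented for this example, and the one-directional fallbacks from the "tricky areas" are not modeled.

```python
# Sample data only; a real matcher would use CLDR likely-subtags.
DEFAULT_REGION = {"fr": "FR", "ro": "RO", "pt": "BR", "en": "US", "es": "ES"}
IMPLIED_SCRIPT = {
    ("zh", "CN"): "Hans", ("zh", "SG"): "Hans",
    ("zh", "TW"): "Hant", ("zh", "HK"): "Hant", ("zh", "MO"): "Hant",
}


def parse(tag):
    """Split 'zh-Hant-TW' or 'fr_CA' into (language, script, region)."""
    parts = tag.replace("_", "-").split("-")
    lang, script, region = parts[0].lower(), None, None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()
    if script is None:
        # Fall back to a script implied by the region, if we know one.
        script = IMPLIED_SCRIPT.get((lang, region))
    return lang, script, region


def match(want, have):
    """Return the 'have' locales acceptable for 'want', best candidates first."""
    result = []
    for wanted in want:
        w_lang, w_script, w_region = parse(wanted)
        candidates = []
        for offered in have:
            h_lang, h_script, h_region = parse(offered)
            if h_lang != w_lang:
                continue
            if w_script and h_script and w_script != h_script:
                continue  # never match across different scripts
            if w_region and h_region == w_region:
                rank = 0  # exact region
            elif h_region == DEFAULT_REGION.get(w_lang):
                rank = 1  # the language's default region
            else:
                rank = 2  # same language, some other region
            candidates.append((rank, offered))
        candidates.sort(key=lambda c: c[0])  # stable: keeps 'have' order on ties
        for _, offered in candidates:
            if offered not in result:
                result.append(offered)
    return result


print(match(["fr", "ro"], ["en", "fr_CA", "fr_FR", "ro_RO"]))
```

An empty result means nothing suitable was found, and the caller decides on an application default (for example, English).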
Some references
RFC 4647 deals with language fallback (but is not very useful in this case, because it follows the "cut from the right" rule).
ICU 4.2 and newer (draft in 4.0, I think) has uloc_addLikelySubtags (and uloc_minimizeSubtags) in uloc.h. That implements http://www.unicode.org/reports/tr35/#Likely_Subtags
Also in ICU uloc.h there are uloc_acceptLanguageFromHTTP and uloc_acceptLanguage that deal with want vs have. But kind of useless as they are, because they take a UEnumeration* as input, and there is no public API to build a UEnumeration.
There is some work on language matching going beyond the simple RFC 4647. See http://cldr.unicode.org/development/design-proposals/languagedistance
Locale matching in ActionScript at http://code.google.com/p/as3localelib/
The APIs in the new Flash Player 10.1 flash.globalization namespace do both tag guessing and language matching (http://help.adobe.com/en_US/FlashPlatform/beta/reference/actionscript/3/flash/globalization/package-detail.html). It works on TR-35 and can look beyond the # and consider the operation. For instance, if have = [ ja ja#collation=radical ja#calendar=japanese ] and want = [ ja#calendar=japanese;collation=radical ] then the best match depends on the operation you want. For date formatting ja#calendar=japanese is the better match, but for collation you want ja#collation=radical
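For comparison, the "cut from the right" rule of RFC 4647 mentioned in the references above can be sketched as follows. This is a simplified version of the RFC's lookup scheme (no wildcard handling), and the function name lookup is chosen just for this example.

```python
def lookup(language_range, available, default=None):
    """Simplified RFC 4647 lookup: truncate the range from the right until it matches."""
    by_lower = {tag.lower(): tag for tag in available}
    parts = language_range.lower().split("-")
    while parts:
        candidate = "-".join(parts)
        if candidate in by_lower:
            return by_lower[candidate]
        parts.pop()  # cut the rightmost subtag
        # The RFC also drops a single-character subtag left at the end.
        if parts and len(parts[-1]) == 1:
            parts.pop()
    return default


print(lookup("zh-Hant-TW", ["zh-Hant", "en"]))
print(lookup("pt", ["pt-BR", "pt-PT"]))
```

The second call finds nothing: truncation only ever makes the requested tag shorter, so a bare "pt" never reaches "pt-BR" or "pt-PT". That is why this rule alone does not answer the question.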
Do you expect to have more users in Portugal or in Brazil? Pick accordingly.
For your general solution, you find out by reading up on Ethnologue.

What things should be localized in an application

When thinking about what areas should be taken into account for a localized version of an application a number of things pop up right away:
Text display
Date and time
Units
Numbers and decimals
User input formats
LeftToRight support
Dialog and control sizes
Are there other things/areas to remember or keep in mind when building a localizable application? Are there any resources out there which provide a listing of best practices not just for text localization but for all things around localization?
After Kudzu's talk about l10n I left the room with way more questions than I had before and none of my old questions answered. But it gave me something to think about and got the message "depends on how far you can/want to go" across.
Translate text bodies with aforementioned things
Test all your controls for length/alignment in LTR/RTL, TTB (top to bottom), BTT and all their combinations.
Look out for special characters and encodings
Look out for combinations of different alignments (LTR, RTL, TTB, BTT) and how they affect punctuation and quotation marks.
Align controls according to text alignment (Hebrew Windows has its Start menu on the right)
Take string lengths into account. They can overflow in other languages.
Put labels at the correct side of icons (LTR, TTB etc)
Translate language selection controls
No texts in images (can't be translated)
Translate EVERYTHING (headers, logos, some languages use different brand names, product names etc)
Does the region have a 24:00 or a 00:00 (changes the AM/PM that goes with it too)
Does the region use AM/PM or the 24-hour clock
What calendar system are they using
What digit is for what part of the date (day, month, year in all its combinations)
Try to avoid "copying [number] files" equivalents. Some regions have different rules about changing words according to quantities. (This is an extremely complicated topic that I will elaborate on if desired)
Translate sentences, not words. Syntax rules are too complicated to put in your business logic.
Don't use flags for regions. Languages != countries
Consider what languages / dialects you can support (e.g. India has a gazillion of languages)
Encoding
Cultural rules (some Western images showing businesswomen can be close to offensive in some other cultures)
Look out for language generalizations (e.g. boot(UK) != boot(US))
Those are the ones from the top of my head. The list just went on and on...
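To illustrate the "copying [number] files" point from the list above: the word form depends on the number, and the rules differ per language. The sketch below hard-codes the integer plural rules for English and Russian only; the names plural_category, MESSAGES and files_label are invented for this example, and a real application would take the rules from CLDR or its platform's plural support.

```python
def plural_category(lang, n):
    """Return the plural category of a non-negative integer n for 'en' or 'ru'."""
    if lang == "en":
        return "one" if n == 1 else "other"
    if lang == "ru":
        if n % 10 == 1 and n % 100 != 11:
            return "one"
        if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
            return "few"
        return "many"
    raise ValueError("no plural rules for language: " + lang)


# One complete phrase per category, never a number glued to a fixed word.
MESSAGES = {
    "en": {"one": "{n} file", "other": "{n} files"},
    "ru": {"one": "{n} файл", "few": "{n} файла", "many": "{n} файлов"},
}


def files_label(lang, n):
    """Pick the phrase that matches the plural category of n."""
    return MESSAGES[lang][plural_category(lang, n)].format(n=n)


for count in (1, 3, 11, 21):
    print(files_label("en", count), "|", files_label("ru", count))
```

English needs two forms here and Russian three, and 21 takes the same form as 1 in Russian, so a plain "if n == 1" check in the business logic does not carry over.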
Don't forget the overhead of converting all documentation and help files.
A couple of hints from my J2ME app days:
Don't translate separate words, translate whole phrases, even if there are matching repetitions. You'll later have to translate to a language where words have to be modified differently in different contexts, and you may end up with an analog of "color: greenish"
Right-to-left support includes numbering of lists, alignment, and alternative scroll bars
Arabic-script languages write the same letter differently depending on the surrounding letters. You can't just print a string from a character buffer; you'll need a special control to output those, or support from your platform
Alphabetical sorting is HARD. No native Chinese speaker could ever explain the rules to me, but they will always spot wrongly sorted words. There appear to be a number of options for sorting Chinese. I guess other languages may have the same problem

Resources