Incorrect translations when implementing pluralization in ICU message format. I'm using react-intl (format.js) - localization

It seems like using pluralization in ICU message format is not dependable when combined with static text.
Take the following example:
Please see the {itemCount, plural,
one {message}
other {messages}
} in your inbox.
In Spanish, the word "the" changes depending on whether or not it's plural. The correct translations in Spanish should be:
one: "Por favor vea el mensaje en su bandeja de entrada"
other: "Por favor vea los mensajes en su bandeja de entrada"
The way to solve this would be to include "the" in the conditional.
Developers are obviously not going to be fluent in every language their app supports. If you don't speak Spanish, you wouldn't know to include "the" in the conditional.
It seems irresponsible to allow this type of syntax. Format.js even promotes using pluralization like this in their examples: https://formatjs.io/docs/core-concepts/icu-syntax/#plural-format
It seems to me that an error should be thrown when attempting to combine static text with dynamic plural text. The entire sentence should be required in the conditional.
My question is: Am I missing something? Should I prohibit my developers from entering values like this?
I can't be the first person that noticed this. Is there a way to enforce this through a setting in Format.js?

Looking at the documentation, it looks as though you're confusing the {selectordinal} with the message.
By way of an example the Spanish format based on what you have provided would be:
Por favor vea {itemCount, plural,
one {el mensaje}
other {los mensajes}
} en su bandeja de entrada.
Even so, that is a relatively simple example: additional words later in the sentence may also be affected by ordinal number, gender, or both, so you may need multiple arguments in the sentence.
Declension is quite the fun linguistic topic for developers to get their heads around.
Edit: To add to this, that would give you something like:
const messages = {
  en: {
    // Template literals (backticks) let each message span several lines
    INBOX: `Please see the {itemCount, plural,
      one {message}
      other {messages}
    } in your inbox.`,
  },
  es: {
    INBOX: `Por favor vea {itemCount, plural,
      one {el mensaje}
      other {los mensajes}
    } en su bandeja de entrada.`,
  },
}
So for the Spanish it is necessary to move the 'the' into the argument, but not for the English. It is really down to the syntax and the person creating localised messages to utilise that syntax for whatever gendered or ordinal conditions they need to be aware of.

Related

How can I run an advanced search for tweets containing the word 'casa' without including 'câsa' results?

I tried this query:
casa -câsa
But that also excludes 'casa' without the accent, so the search comes back empty.
To the best of my knowledge, Twitter flattens all accented Latin letters and treats them the same, so a = á = â = à = ä = ā = ã = å.
One possible way to clean up your search results a little is to use Twitter's advanced search language operator lang:[xx] in its negated form, -lang:[xx], where [xx] is the two-letter ISO language code of a language that uses that particular letter (assuming you want to filter those languages out of the results).
In your example, the letter Ââ (circumflex) is used by the following languages: Sami, Romanian, Vietnamese, French, Frisian, Portuguese, Turkish, Walloon and Welsh. Assuming you wish to filter-out results from these specific languages, your Twitter search query would look like this:
"casa" -lang:se -lang:ro -lang:vi -lang:fr -lang:fy -lang:pt -lang:tr -lang:wa -lang:cy
Alternatively, you can use the same lang:[xx] operator to limit Twitter's search results to one specific language (for example - English):
"casa" lang:en
This might not be a water-tight solution but it can reduce a lot of false positives.
Finally, keep in mind that Twitter does not guarantee the accuracy of its machine identification of languages.
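As a rough illustration, the exclusion query above can also be assembled programmatically. The helper name build_query and the constant below are hypothetical, the language list is simply the one given above, and the sketch only builds the search string; it does not call any Twitter API.

```ruby
# Language codes from the list above whose alphabets include "â".
CIRCUMFLEX_A_LANGS = %w[se ro vi fr fy pt tr wa cy].freeze

# Hypothetical helper: quote the term, then append one negated
# lang: operator per language that should be filtered out.
def build_query(term, excluded_langs)
  exclusions = excluded_langs.map { |code| "-lang:#{code}" }
  ([%("#{term}")] + exclusions).join(" ")
end

puts build_query("casa", CIRCUMFLEX_A_LANGS)
```

Passing an empty list leaves just the quoted term, so the same helper covers the unfiltered case.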

How to add a name in a .bib file in LaTeX

I have a name like Catherine de Palo Drid. I want it to appear in the reference as
Drid C de P
I entered it like this:
author={Drid, Catherine, de, Palo}
I have rearranged it many times, but nothing works.
Can anyone help?
Thanks
From the BibTeX documentation (btxdoc.pdf, p. 15/16):
Each name consists of four parts: First, von, Last, and Jr;
BibTeX allows three possible forms for the name:
"First von Last"
"von Last, First"
"von Last, Jr, First"
You want to treat "Catherine de Palo" as First and "Drid" as Last, since abbreviating names is only done in First. In that case I would use
author = {Drid, Catherine {de} Palo}
where the braces around "de" tell BibTeX to not alter that token.

Rails 5: 'humanize' doesn't capitalize phrases that start with an accented character

Localization is working fine, as per the official documentation. However I have found that Rails' humanize method doesn't correctly capitalize the first character of a sentence if it's accented.
For example, if I have in config/locales/fr.yml:
fr:
  about_me: "à propos de moi"
... and in the view:
<%= t("about_me").humanize %>
... the output in the browser is
à propos de moi
... whereas it should be
A propos de moi
If I change the à to a, humanize works as expected.
Note that in French accents on capital letters are sometimes omitted but let's leave that aside. I'd be happy with:
À propos de moi
Do I just need to hardcode the capital letters in the YAML files to work around this? Naturally I'd prefer not to resort to this.
I'd recommend doing it in the YAML. Capitalization depends on the language and the context: for example, German capitalizes all nouns while English only capitalizes proper nouns, and then there is the accent issue in French that you're already aware of, the relationship between ß and SS in German, etc. Getting things right through simple-minded string manipulation is very error prone. You're better off treating human-readable strings as opaque and immutable pieces of data that you pull out of your I18N/L10N string database and give to the user as-is.
This is more work but being correct is sort of important.
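To illustrate how easily plain string manipulation goes wrong here: in Ruby regular expressions \w matches only ASCII word characters, so a capitalization step that relies on it skips a leading "à". The sketch below is plain Ruby (2.4 or later, for Unicode-aware upcase), not Rails' own code, and capitalize_first is a hypothetical helper. It still knows nothing about the language-specific rules mentioned above, which is why storing the final text in the YAML remains the safer route.

```ruby
phrase = "à propos de moi"

# \w is ASCII-only in Ruby regexes, so it does not match the leading "à"
# and the string comes back unchanged.
ascii_attempt = phrase.sub(/\A\w/) { |m| m.upcase }

# Hypothetical helper: \p{L} matches any Unicode letter, and
# String#upcase is Unicode-aware from Ruby 2.4 on.
def capitalize_first(text)
  text.sub(/\A\p{L}/) { |m| m.upcase }
end

puts ascii_attempt            # à propos de moi
puts capitalize_first(phrase) # À propos de moi
```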

Titleize with roman numerals, dashes, apostrophes, etc. in Ruby on Rails

I'm simply trying to convert uppercased company names into proper names.
Company names can include:
Dashes
Apostrophes
Roman Numerals
Text like LLC, LP, INC which should stay uppercase.
I thought I might be able to use acronyms like this:
ACRONYMS = %W( LP III IV VI VII VIII IX GI)
ActiveSupport::Inflector.inflections(:en) do |inflect|
  ACRONYMS.each { |a| inflect.acronym(a) }
end
However, the conversion does not take into account word breaks, so having VI and VII does not work. For example, the conversion of "ADVISORS".titleize is "Ad VI Sors", as the VI becomes a whole word.
Dashes get removed.
It seems like there should be a generic gem for this generic problem, but I didn't find one. Is this problem really not that common? What's the best solution besides completely hacking the current inflection library?
Company names are a little odd, since a lot of times they're Marks (as in Service Mark) more than proper names. That means precise capitalization might actually matter, and trying to titleize might not be worth it.
In any case, here's a pattern that might work. Build your list of tokens to "keep", then manually split the string up and titleize the non-token parts.
# Outside Rails, load ActiveSupport's String#titleize first
require "active_support/core_ext/string/inflections"

# Make sure you put long strings before short (VII before VI)
word_tokens = %w{VII VI IX XI}
# Special characters need to be separate, since they never appear as "part" of another word
special_tokens = %w{-}
# Builds a regex like /(\bVII\b|\bVI\b|\bIX\b|\bXI\b|-)/ that wraps "word tokens" in a word boundary check
token_regex = /(#{word_tokens.map{|t| /\b#{t}\b/}.join("|")}|#{special_tokens.join("|")})/
title = "ADVISORS-XI"
# Keep matched tokens as-is and titleize everything in between
title.split(token_regex).map{|s| s =~ token_regex ? s : s.titleize}.join # => "Advisors-XI"

Handling grammatical gender with Gettext

I'm looking for a simple-proper-elegant way to handle grammatical gender with Gettext in a Rails application, the same way plurals are handled with n_() method.
This matters little in English, where few words vary with gender (his/her is one of the rare cases), but it is really needed when translating into Spanish.
An example:
Considering users Pablo (male) and Ana María (female).
_('%{user} is tall') % {:user => user.name}
Should be translated to
'Pablo es alto'
'Ana María es alta'
Of course, we have access to user.gender
Any ideas?
Cheers!
Using standard gettext features, this can be solved with contexts, by calling the appropriate one of:
p_('Male', '%{user} is tall')
or
p_('Female', '%{user} is tall')
This will generate two separate strings in gettext catalogs marking them with "Male" and "Female" contexts.
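A minimal sketch of how the two contexts end up as two independent translations. To stay self-contained it does not load the gettext gem: CATALOG and this p_ are hypothetical stand-ins that mimic a context-aware lookup, and User is an assumed model exposing name and gender.

```ruby
# Hypothetical stand-in for a Spanish catalog: one entry per (context, msgid).
CATALOG = {
  ["Male",   "%{user} is tall"] => "%{user} es alto",
  ["Female", "%{user} is tall"] => "%{user} es alta"
}.freeze

# Stand-in for gettext's p_: look up by context, fall back to the msgid.
def p_(context, msgid)
  CATALOG.fetch([context, msgid], msgid)
end

# Assumed user model with a name and a gender symbol.
User = Struct.new(:name, :gender)

# Both p_ calls are written out literally, mirroring the two calls above.
def tall_message(user)
  template = if user.gender == :male
               p_("Male", "%{user} is tall")
             else
               p_("Female", "%{user} is tall")
             end
  template % { user: user.name }
end

puts tall_message(User.new("Pablo", :male))       # Pablo es alto
puts tall_message(User.new("Ana María", :female)) # Ana María es alta
```

Writing both calls out, rather than passing the context in a variable, keeps each context/string pair visible in the source, which matters if the catalog is generated by scanning the code for literal strings.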
Unfortunately no. This is a limitation of the gettext system--aside from number, linguistic features are based on the language you key off of. If you were to key all of your strings in Spanish, it would work.
Another option would be to append a character to the string for translation's sake, and then strip it off.
I'm not familiar with Ruby, but the basic idea in pseudo-code would be:
if (user.sex == male) {
  strip_last_char(_('%{user} is tall♂') % {:user => user.name})
} else {
  strip_last_char(_('%{user} is tall♀') % {:user => user.name})
}
What about using gettext's plural form mechanism? Usually a parameter n is used to distinguish between singular and plural forms.
Now imagine using n to encode gender instead of an amount. For example, n=1 means male (rather than singular) and any other value, here n=0, means female (rather than plural).
n = user.male? ? 1 : 0
n_('%{user} is tall', '%{user} is tall', n) % {:user => user.name}

Resources