Language codes reference - localization

Can someone please give me the official reference for the language (country/region) codes? I'm finding different codes for the same language (es_ES, esp_ESP, etc.) and I can't figure out which one is the right one.

There are several different standards specifying language codes, including ISO 639 with its parts 1 to 3 and IETF language tags, which describe a system for building codes rather than a fixed list of codes.
Which standard is "the right" standard depends on your use case and context. See

That's because there are different standards for language codes, using different numbers of letters. You might have to choose which standard to use, and maybe detect which standard your data source is using.
This is a starting point:

These codes combine a specific language with the country in which the language is used. For instance, es_ES means Spanish as used in Spain, and es_AR means Spanish as used in Argentina. For the language code there's the Language Matrix; for the country part you could use the ISO 3166-1 alpha-2 country codes (ISO 3166-2 covers subdivisions within a country).
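To make the language-plus-country structure concrete, here is a minimal sketch that splits such an identifier into its two parts. The names `LocaleTag` and `parse_locale` are made up for this example, and the pattern only covers the simple `language[_REGION]` shape discussed above, not full IETF tags with scripts or variants.

```python
import re
from typing import NamedTuple, Optional


class LocaleTag(NamedTuple):
    language: str            # lowercase language code, e.g. "es"
    region: Optional[str]    # uppercase two-letter country code, e.g. "ES", or None


# A two- or three-letter language code, optionally followed by a two-letter
# country code, separated by "_" (POSIX style) or "-" (IETF style).
_TAG = re.compile(r"^([A-Za-z]{2,3})(?:[-_]([A-Za-z]{2}))?$")


def parse_locale(tag: str) -> LocaleTag:
    """Split an identifier such as "es_ES" into language and region."""
    match = _TAG.match(tag.strip())
    if match is None:
        raise ValueError(f"unrecognized locale identifier: {tag!r}")
    language, region = match.groups()
    return LocaleTag(language.lower(), region.upper() if region else None)
```

With this sketch, `parse_locale("es_AR")` yields language `es` and region `AR`, while a non-standard form such as `esp_ESP` is rejected, which is one way to detect that a data source uses a different convention.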

You can find all the region codes in the documentation here


Android localization/translation

I have a keyboard app designed for the Serbian language. My keys have labels based on the Serbian Cyrillic alphabet. The XML strings used for those labels are enclosed in <xliff:g></xliff:g> tags, but a certain provider on a certain type of phone still translates them into a different language. Just in case, I also have my strings in language-specific folders, but it still happens. Does anyone know of any other way to disable translation of all my strings?

Chromecast localized name?

I'm writing an app that will display the word "Chromecast" in a menu. Makes sense, right?
But my app is localized to a few different languages. Does Google have a resource where I can look up any localized trade name for Chromecast? Does it change at all between locales?
More generally, does this kind of thing exist for other brand/trade names?
Chromecast itself shouldn't be translated; it is a proper name and should remain the same. However, there are some other expressions and terminology that are commonly used while you are casting, and we have a spreadsheet to help you with that: go to this doc and look at the subsection "Cast terminology translations"

Detect when to use a vs an

I have a service that allows users (admins) to change the terminology the site uses. My designer wants me to use the format "A Group". The problem is that for some terminology it should be "An", not "A".
Is there any way to reliably detect which to use? What about localization?
I can brute force it and get 90% of the way by checking the first letter for consonant vs vowel. That won't work for all words though. And that doesn't cover any language except English.
In my opinion you've got only two options:
1- Check the first letter, and process the whole sentence letter by letter to see whether it contains any non-English letters.
2- Provide a dictionary of English nouns, then look your word up to find out whether it needs an "a" or an "an".
Although the "a versus an" issue is very specific, what you're describing here is a natural language processing issue. Essentially you are being asked to write code that generates a grammatically correct piece of text.
I think you should try to explain the implications to the designer, especially if you end up localizing into other languages. Your time is probably better spent working on your app's business logic than on language processing.
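For illustration, the first-letter check combined with a small exception dictionary could look like the sketch below. The function name and both word lists are hypothetical and deliberately tiny; the approach is English-only and will still miss words that are not listed.

```python
VOWELS = "aeiou"

# Illustrative exception lists only; a real deployment would need far more entries.
AN_EXCEPTIONS = {"hour", "honest", "honor", "heir"}                 # silent "h"
A_EXCEPTIONS = {"user", "unit", "university", "european", "one"}    # "yoo"/"w" sound


def indefinite_article(noun_phrase: str) -> str:
    """Return "a" or "an" for an English noun phrase (heuristic, not exhaustive)."""
    words = noun_phrase.strip().split()
    if not words:
        raise ValueError("noun_phrase must contain at least one word")
    first_word = words[0].lower()
    # The exception dictionary is consulted before the first-letter rule.
    if first_word in AN_EXCEPTIONS:
        return "an"
    if first_word in A_EXCEPTIONS:
        return "a"
    return "an" if first_word[0] in VOWELS else "a"
```

An admin-supplied term such as "User Group" then gets "a" despite starting with a vowel, while "Organization" gets "an". None of this carries over to other languages, which is the point made above.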

Source code not English; which (natural) language to display to the user?

I'm creating an English translation for a program written in German (i.e. all strings within tr("...") are German). Users who are in a non-English non-German locale will probably want to see the English translation, but with the program as it is now they will see German.
There are some ways to solve this problem:
Check if it's a German locale and force to English otherwise.
Present an option to the user.
Make the programmers change their source code to English.
What is considered best-practice for internationalizing where the source code is not in English?
These are two separate questions.
The best practice is to not use any kind of hard-coded string in the sources.
Strings should be stored in external files and loaded by ID.
But what you have there does not sound like the best practice. Might be too much work to get it there.
What you describe (the tr("...") stuff) sounds like gettext (or something similar).
The approach taken by gettext (and similar libraries) is that "the stuff in the sources is the ultimate fallback", used if the strings for the desired language are not present.
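The fallback behaviour can be seen with Python's standard `gettext` module. This is only a sketch: the domain name `myapp` and the German source string are assumptions, and the locale directory is intentionally empty so that no French catalog can be found.

```python
import gettext
import tempfile

# An empty directory stands in for an installation without a French catalog.
with tempfile.TemporaryDirectory() as empty_locale_dir:
    translations = gettext.translation(
        "myapp",                  # hypothetical domain name
        localedir=empty_locale_dir,
        languages=["fr"],
        fallback=True,            # return a pass-through object instead of raising
    )
    tr = translations.gettext
    # No catalog was found, so the string from the source code comes back as-is.
    shown = tr("Datei öffnen")
```

Here `shown` is still the German text, which is exactly what a user in a French locale would see when the source strings are German.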
In this case I would go with "Present an option to the user."
You can't assume the user knows English.
Real example: in Switzerland the official languages are Italian, German, French and Romansh. If I ask for French and it is not present, then the next best option is probably German, not English. In Canada the official languages are French and English, so if I ask for French and it is not available, the next best option is probably English.
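If you did want an automatic choice, a region-aware fallback chain along these lines is one way to express the idea. The table contents and the name `pick_language` are assumptions for illustration, not official data.

```python
from typing import Optional, Set

# Illustrative per-region preferences (assumed for this sketch).
REGION_FALLBACKS = {
    "CH": ["de", "fr", "it", "en"],
    "CA": ["en", "fr"],
}
DEFAULT_FALLBACKS = ["en"]


def pick_language(requested: str, region: str, available: Set[str]) -> Optional[str]:
    """Return the first language that is available: the requested one,
    then the region's fallbacks, then the global default."""
    candidates = [requested] + REGION_FALLBACKS.get(region, []) + DEFAULT_FALLBACKS
    for language in candidates:
        if language in available:
            return language
    return None  # nothing suitable; the program's source language is all that is left
```

With only German and English translations shipped, a French request from Switzerland resolves to German and the same request from Canada resolves to English.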
I think the best option is asking the user (during installation probably).
Changing the source to English is too costly and not worth it. I live in Brazil, where we have tons of code in Portuguese, and translating it to English hasn't been necessary even once (and we do make software for English speakers). The exception is a client who requires it (usually when you are also selling the source).
Hope it helps
OK, so I guess the three options are:
Recompile the program with translated strings.
This is fraught with danger as you'll end up with two copies of the source. Bug-fixes in one will need to be done in the other. And then, what happens if you need French? Italian? Spanish? The only advantage of this approach is that it's feasible for a non-developer to do the work. (Just about.)
Resource out the strings, and automatically check what the UI locale is on load.
Here the strings are replaced with GetResource("key") or similar. On load the program automatically translates to the user's culture. This might work, but I know plenty of German-speakers who have English-language culture installed on their PCs but who would prefer German language programs at some points.
Resource out the strings and give the user the choice on load
In general it's always best to give the user control. This might be a prompt on load, although if the application is used often this can be an annoyance. Perhaps a balance is to ask the user for their preference during installation and then give them an option in a dialog to change this setting later.
Note, by the way, that translation is not localisation. For instance: number formats are quite different in Germany (e.g. 1.233,44) from English (e.g. 1,233.44). Icons and suchlike often have national characteristics.
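As a small illustration of the number-format difference, here is a hand-rolled sketch that only swaps the separators. The function name and the `"en"`/`"de"` style keys are made up for this example; a real program would more likely rely on the platform's locale support (for example Python's `locale` module) than on code like this.

```python
def format_number(value: float, style: str) -> str:
    """Format a number with two decimals in English or German style."""
    english = f"{value:,.2f}"  # thousands separator ",", decimal point "."
    if style == "en":
        return english
    if style == "de":
        # German style swaps the two separators.
        return english.translate(str.maketrans(",.", ".,"))
    raise ValueError(f"unsupported style: {style!r}")


print(format_number(1233.44, "en"))  # 1,233.44
print(format_number(1233.44, "de"))  # 1.233,44
```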

Free-to-use dictionaries for European languages in a programmer-friendly format

I want to experiment with an idea I have of automatically localizing software, or at least suggesting a reasonable translation if a localized string is not available.
I'm not sure this will be working satisfactorily tomorrow morning but I just wanted to play with this idea.
Does anybody know of a dictionary that is free to use and in an easy-to-parse format, that can help me automatically translate words from English to other European languages (French, German, Spanish, etc.)?
The FreeDict project has quite a few relatively complete dictionaries. Most are from one language to English or vice versa, but some are between two non-English languages as well.
I don't know of any dictionary, but I would like to point something out. You have to bear in mind that translation is not a direct word-for-word technique in any sense. The rules of the language change as well, so a word-for-word translation leaves sentences unreadable. This is why even companies like Google have trouble making good translation software. Context is very hard to detect programmatically, and context means everything in choosing the right word, the right structure and so on.
Maybe use a translation API, if there is one. Google only seems to offer a JavaScript API for language.
You can't even expect to get a reasonable translation with an automatic method. Translating full texts completely correctly is too hard for a computer, and translating short phrases correctly is impossible.
Take for example the simple text "Open": without context it's not even possible to tell whether it's a verb or an adjective. I know that at least in German the verb and the adjective translate into two different words.
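One common way around the "Open" ambiguity is to key each translation on a context as well as on the text, which is roughly the job of `pgettext` in gettext-style libraries. The table and the context labels below are invented for this sketch.

```python
# Hypothetical German table keyed by (context, source text).
GERMAN = {
    ("verb", "Open"): "Öffnen",       # a command, e.g. a menu item
    ("adjective", "Open"): "Offen",   # a state, e.g. a status label
}


def translate(text: str, context: str) -> str:
    """Look up a translation by context and text; fall back to the source text."""
    return GERMAN.get((context, text), text)


print(translate("Open", "verb"))       # Öffnen
print(translate("Open", "adjective"))  # Offen
```

A plain word-to-word dictionary has no such context column, which is why it cannot choose between the two.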
Also, computer specific concepts often borrow words from similar concepts outside the computer sphere. Those concepts often have a specific translation, but an automatic translation would sometimes try to translate it as if it was the original meaning, which can give you very strange translations.
After searching for a while, I solved the problem myself by starting to create my own dictionary. I do a lot of translations in my free time. In the beginning it is really boring work, but after a while you get a really good dictionary. Some friends of mine use it too, and we all benefit from every new word we translate.
