Crowdsourcing translation for mobile developers? - localization

I am developing applications for mobile phones with different operating systems (Android, Symbian, iPhone). The applications are sold internationally, so they need to be translated into other languages in addition to the English version.
I assume most mobile developers have the translations done by a paid external service each time. This approach does not look very cost-effective to me. Would it make sense to have a website where simple translations would be done using crowdsourcing (by other developers)? Most strings in mobile applications are very simple and short, for example "OK", "Cancel", "Are you sure?", "Please enter your password". Also, the same strings are used in hundreds of applications. Instead of paying for translating all strings, developers could save money by only buying their difficult application-specific translations.
Does anyone agree with this idea? I have seen many open-source projects doing their translations successfully using volunteers.

I just found a solution that works for me. Many users find this question through Google, so I think this post may be helpful:
The solution for us is crowdin.com - an agile localization solution for tech companies.

Microsoft allows you to view their terminology database: https://www.microsoft.com/Language/en-US/Default.aspx
That covers about 90 languages and will get you the things you mention such as common button captions, etc.
The problem you are facing after that is getting only the strings you want translated. Most translators are going to charge you for a minimum number of words. And they are going to want the entire resource file (regardless of whether you translated parts of it yourself). That makes sense, because localizing a product means that they need to have the whole picture to ensure consistency, etc. Professional translators will probably not charge you for what they call 100% matches.
I would never ever trust the translation of my product to crowd sourcing. Ever. You get what you pay for. Besides, just because you speak a language natively doesn't mean that you can write well, etc.

How do you check the crowd sourcing translation results for accuracy and quality? In a famous and documented occurrence recently the phrase "No lorries by this route please use the main road" was translated into "We are out of the office until Monday please contact us again then" and turned into road signs that were erected.
Crowd sourcing translation has been used, and Facebook is probably the largest company I know of that tried/used it. I have not tracked their progress, but you could investigate it to see its success or otherwise. Their method of quality checking was to get other people using the translations to vote for the one they preferred, so this was a case of crowd sourcing quality control. At this point the proposal that a camel is a horse designed by a committee jumps unbidden into my mind.
Translation, in spite of all the machine effort pumped into it, is still more of an art than a science. To translate correctly you need to have a native speaker translating from another language into their own. So for English to German you need a native German speaker who can speak English very well to do it. Within the profession very, very few translators will translate into a language in which they are non-native. The reasons for this are many but boil down to the colloquial nature of language.
To be positive, you could look at how Facebook fared and follow that route. Another route would be to approach not translators but a translation agency; there are quite a number of these. Present them with the whole corpus you want translating in the original English and get them to quote you for the whole job. This would mean someone else managing the job and the quality, and they may have shortcuts, especially if the translations are of fairly standard "computerese" type phrases, e.g. 'Home', 'Back', 'Next', 'Click here' etc.

Related

Any good (free) text-to-speech engines out there?

I've been scouring the SO board and Google and can't find any really good recommendations for this. I'm building a Twilio application and the text-to-speech (TTS) engine is way bad. Plus, it's a pain in the ass to test since I have to deploy every time. Is there a significantly better resource out there that could render to a WAV or MP3 file so I can save and use that instead? Maybe there's a great API for this somewhere. I just want to avoid recording 200 MP3 files myself, and would rather have this generated programmatically...
Things I've seen and rejected:
http://www.yakitome.com/ (I couldn't force myself to give them my email)
http://www2.research.att.com/~ttsweb/tts/demo.php
http://www.naturalreaders.com/index.htm
http://www.panopreter.com/index.php (on the basis of crappy website)
Thinking of paying for this, but not sure yet: https://ondemand.neospeech.com/
Obviously I'm new to this, if I'm missing something obvious, please point it out...
I am not sure if you have access to a Mac or not. The Mac has pretty advanced TTS built into the operating system. Apple spent a lot of money on top engineers to research it. It can easily be controlled and even automated from the command line. It also has quite a few built-in voices to choose from. That is what I used on a recent phone system I put up. But I realize that this is not an option if you don't have a Mac.
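As a rough sketch of what that automation could look like, assuming the macOS `say` command with its `-v` (voice) and `-o` (output file) options. The voice name "Alex", the prompt names and the file names are placeholders, and the helper functions are made up for illustration:

```python
import subprocess

def build_say_command(text, out_path, voice="Alex"):
    """Return the argument list for macOS `say` rendering `text` to a file."""
    # "Alex" is an assumed voice name; `say -v '?'` lists the installed ones.
    return ["say", "-v", voice, "-o", out_path, text]

def render_prompts(prompts, voice="Alex"):
    """Render each prompt to <name>.aiff. Only works on a Mac."""
    for name, text in prompts.items():
        subprocess.run(build_say_command(text, f"{name}.aiff", voice), check=True)
```

On a Mac, something like `render_prompts({"welcome": "Welcome to the support line."})` should leave a `welcome.aiff` in the current directory, which could then be converted to WAV or MP3 with a separate tool.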
Another one you might want to check out is http://cepstral.com/; they have very realistic voices. I think they used to be open source, but they no longer are, and now you need to pay licensing fees. They are very commonly used for high-end commercial applications and are not so much geared towards the home user who wants an article read to them.
I like the YAKiToMe! website the best. It's free and the voices are top quality. In case you're still worried about giving them your email, they've never spammed me in many years of use and I never got onto any spam lists after signing up with them, so I doubt they sold my email. Anyway, the service is great and has lots of features for turning electronic text into audio files in different languages.
As for the API you're looking for, YAKiToMe! has a well-documented API and it's free to use. You have to register with the site to use it, but that's because it lets you customize pronunciation and voice selection, so it needs to differentiate you from other users.

Is it easy to make 2 language versions for iPhone (Japanese)?

I have a client who I am pricing an app for, however other than the English version they would also like a Japanese version. Has anyone had experience in a similar case, is there an easy way to do it? Do I need to create two versions, one English and one Japanese? If it were two Latin languages I could imagine it would be easier but Japanese write from top to bottom, right to left so this worries me.
You don't seem to know a lot about Japanese. They're perfectly accustomed to western-style left-right, top-down writing, especially due to the influence of computers. You can of course create separate views (views only, no need for separate apps) for Japanese that switch everything to top-down, right-left writing. But it's only a minority of apps that do that. In fact, the Daijirin Japanese-Japanese dictionary is the only example I know of.
Talk to your client about what kind of Japanese localization he wants. Odds are, he just wants strings replaced. See @kelloti's answer.
As general advice: make sure you get a native translator/developer who can guide you towards a good localization. Don't simply copy-paste in strings you get from somebody else that you have no idea how to even read. This only produces terribly localized versions.
Read the Apple documentation on internationalization. I don't think you should have many issues with Japanese (how else would they sell phones in Japan?)

Source code not English; which (natural) language to display to the user?

I'm creating an English translation for a program written in German (i.e. all strings within tr("...") are German). Users who are in a non-English non-German locale will probably want to see the English translation, but with the program as it is now they will see German.
There are some ways to solve this problem:
Check if it's a German locale and force to English otherwise.
Present an option to the user.
Make the programmers change their source code to English.
What is considered best-practice for internationalizing where the source code is not in English?
These are two separate questions.
The best practice is to not use any kind of hard-coded string in the sources.
Strings should be stored in external files and loaded by ID.
But what you have there does not sound like the best practice. Might be too much work to get it there.
What you describe (the tr("...") stuff) sounds like gettext (or something similar).
The approach of gettext (and similar libraries) is that "the stuff in the sources is the ultimate fallback", used if the strings for the desired language are not present.
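To illustrate that fallback behaviour, here is a minimal sketch using Python's standard `gettext` module (not necessarily the toolkit the question uses). The domain name "myapp" and the "locale" directory are placeholders:

```python
import gettext

def load_translator(language, domain="myapp", localedir="locale"):
    """Load the catalog for `language`, or a pass-through translator."""
    # fallback=True returns a NullTranslations object instead of raising
    # when no catalog for `language` can be found.
    return gettext.translation(domain, localedir=localedir,
                               languages=[language], fallback=True)

tr = load_translator("fr").gettext
# If no French catalog exists, the German source string comes back unchanged.
print(tr("Speichern"))
```

This is exactly the situation described in the question: a user in a French locale with no French catalog ends up seeing the German text from the sources.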
In this case I would go with "Present an option to the user."
You can't assume the user knows English.
Real example: in Switzerland the official languages are Italian, German, French and Romansh. If I ask for French and it is not present, then the next best option is probably German, not English. In Canada the official languages are French and English, so if I ask for French and it is not available, the next best option is probably English.
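A small sketch of such region-aware fallback, assuming a hand-written table of preference orders. The table contents, the function name and the German source language are illustrative assumptions based on the examples above:

```python
# Hypothetical per-region preference orders, following the examples above.
FALLBACKS = {
    ("fr", "CH"): ["fr", "de", "en"],
    ("fr", "CA"): ["fr", "en"],
}

def pick_language(requested, region, available, source_language="de"):
    """Return the first available language in the region's fallback order."""
    chain = FALLBACKS.get((requested, region), [requested])
    for lang in chain:
        if lang in available:
            return lang
    # Nothing matched: the strings in the sources are the last resort.
    return source_language

print(pick_language("fr", "CH", {"de", "en"}))  # de
print(pick_language("fr", "CA", {"de", "en"}))  # en
```

A table like this can only ever be a guess, which is why asking the user remains the safer option.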
I think the best option is asking the user (during installation probably).
Changing the source to English is too costly and not worth it. I live in Brazil, we have tons of code in Portuguese, and translating it to English wasn't necessary even once (and we do make software for English speakers). The exception is when you have a client that requires you to do so (usually when you are also selling the source).
Hope it helps
OK, so I guess the three options are:
Recompile the program with translated strings.
This is fraught with danger as you'll end up with two copies of the source. Bug-fixes in one will need to be done in the other. And then, what happens if you need French? Italian? Spanish? The only advantage of this approach is that it's feasible for a non-developer to do the work. (Just about.)
Resource out the strings, and automatically check what the UI locale is on load.
Here the strings are replaced with GetResource("key") or similar. On load the program automatically translates to the user's culture. This might work, but I know plenty of German-speakers who have English-language culture installed on their PCs but who would prefer German language programs at some points.
Resource out the strings and give the user the choice on load
In general it's always best to give the user control. This might be a prompt on load, although if the application is used often this can be an annoyance. Perhaps a balance is to ask the user during installation for their preference and then give them an option in a dialog to change this setting later.
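A minimal sketch of "resourcing out" the strings with a user-selectable language. The string tables, keys and class names are made up for illustration, and a real program would load the tables from external files:

```python
# Illustrative string tables; keys and texts are placeholders.
RESOURCES = {
    "de": {"file.save": "Speichern", "app.quit": "Beenden"},
    "en": {"file.save": "Save"},
}

class Settings:
    """Holds the language chosen at installation; can be changed later."""
    def __init__(self, language):
        self.language = language

def get_resource(key, settings, default_language="de"):
    """Look up `key` in the user's language, else in the source language."""
    table = RESOURCES.get(settings.language, {})
    if key in table:
        return table[key]
    return RESOURCES[default_language][key]

settings = Settings("en")                   # answer given during installation
print(get_resource("file.save", settings))  # Save
print(get_resource("app.quit", settings))   # Beenden (not yet translated)
settings.language = "de"                    # changed later in a dialog
print(get_resource("file.save", settings))  # Speichern
```

Because the choice lives in a setting rather than being derived from the OS culture, a German speaker on an English-language PC can still get the German UI.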
Note, by the way, that translation is not localisation. For instance: number formats are quite different in Germany (e.g. 1.233,44) from English (e.g. 1,233.44). Icons and suchlike often have national characteristics.
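To make the number-format difference concrete, here is a small hand-rolled sketch. It deliberately avoids the `locale` module, since the required locales may not be installed on every machine; a real application would normally use its platform's locale facilities instead. The function name and the "en"/"de" style codes are assumptions:

```python
def format_number(value, style):
    """Format with two decimals using English or German separators."""
    english = f"{value:,.2f}"  # comma for thousands, dot for decimals
    if style == "en":
        return english
    if style == "de":
        # Swap the two separators: ',' becomes '.' and '.' becomes ','.
        return english.translate(str.maketrans(",.", ".,"))
    raise ValueError(f"unsupported style: {style}")

print(format_number(1233.44, "en"))  # 1,233.44
print(format_number(1233.44, "de"))  # 1.233,44
```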

Online Translation

I am trying to develop an online translation service (sort of a personal challenge), but I have been looking for any guidelines or any way to see how it should be done, and so far I have come up with nothing. In a nutshell, does anybody know where to find a service, code or explanation of how online translation works and/or guidelines for making one?
You could take a look at a similar project: Machine Translation
For a "personal challenge" this project seems way too big. You would need a huge dictionary and very sophisticated translation algorithms.
Or are you asking if there are APIs to existing translation services?
Decent online translation services work as follows:
Email company with text to translate
They get humans to translate it.
Company sends translated text back in another email
At some point in the above, money changes hands.
Automated translation services tend to not work well, due to the huge amount of information required to translate text other than just the text itself, and issues that arise when there isn't an accurate translation for something between 2 languages.
This is a big undertaking. For personal use I use Google Translate. It does not do a great job, but it is good enough that I can get a decent understanding. At work we use COMIDOC, a fairly expensive commercial service. It's not perfect, and we have to do a lot of work setting up specialized translations of technical sentences.
You can have a look at the code of Spanish English, which is an online translation site.

Browser language: autodetect vs user select?

I am designing a localized web app. I am leaning towards auto-detecting the browser language setting. But I notice a number of respectable sites asking the user to select a language. Is there any usability issue you know of (from actual experiences out there) with just auto-detecting the user's language?
Thanks.
Give me a choice
Remember my choice
Use the auto-detect as default
Make transition easy
In many situations I prefer or even need the "original" over my local one, bad translations or different content being the major reasons.
If you register multiple domains, you can base your auto-detect on that: when foo.com redirects me to foo.de, or otherwise shows me a German interface, it is actively ignoring my choice to go to foo.com.
MSDN did insist on showing me atrocious automatic translations and ALWAYS made me click to go to the readable, understandable English one (that's a step up: when they introduced it, the default selection for changing the language was something like Afrikaans).
Make transition easy: i.e. make it easy to go to the counterpart of the current page in a different language. Amazon often succeeds when I change ".com" to ".de", but then it fails to lead me to the German translation of the item. That's not always possible, as that requires each local view having the same structure and a 1:1 page mapping. But generally, you have to weigh the above requirements against other constraints of the project.
[edit] MSDN got better now :)
I would suggest autodetecting the language and displaying the site in this language, or in the default language (probably English) if the translation is not available. Additionally, present the user with a selection of languages at the top or bottom of your page. The names of the languages should be written in the target language.
Don't do it like that: English, German, Italian.
But: English, Deutsch, Italiano.
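A sketch of how this could look on the server side, assuming detection is based on the browser's `Accept-Language` header. The supported languages and function names are illustrative, and the parser is simplified rather than a full implementation of the header grammar:

```python
# Languages the site supports, each named in its own language.
NATIVE_NAMES = {"en": "English", "de": "Deutsch", "it": "Italiano"}

def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    entries = []
    for part in header.split(","):
        tag, _, params = part.strip().partition(";")
        if not tag:
            continue
        quality = 1.0  # a missing q value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        entries.append((quality, tag.strip().lower()))
    # sorted() is stable, so tags with equal weight keep their header order.
    ranked = sorted(entries, key=lambda entry: -entry[0])
    return [tag for quality, tag in ranked if quality > 0]

def detect_language(header, default="en"):
    """Pick the first supported language, or the default if none matches."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]
        if primary in NATIVE_NAMES:
            return primary
    return default

print(detect_language("de-CH,de;q=0.9,en;q=0.8"))  # de
print(detect_language("ja,fr;q=0.5"))              # en
print(" | ".join(NATIVE_NAMES.values()))           # English | Deutsch | Italiano
```

The last line is the kind of language selector that would sit at the top or bottom of every page, whatever the detection came up with.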
Obviously there is the usability problem that you might detect a language that the user doesn't understand. How are you going to do the detection? Don't think everybody has their browser set to the correct language. IP addresses are also a very bad indicator of the user's language.
Practical example: YouTube tried to convince me for a week or so to use the Japanese version, though I can't read Japanese. Not very helpful. Microsoft is also determined to serve me automatically translated versions of their documentation when I just want to read the English one.
So don't try to tell your users which language they're supposed to prefer, let them decide for themselves.
I really hate non-configurable auto-detection because a lot of applications are translated more than imperfectly. I would rather read perfect English than bad Russian. For example, some terms do not translate in a reasonable way, and trying to translate everything makes the localized version faintly ridiculous.
Also some applications can not translate new features fast enough, leading to a mixed language.
So I always prefer to have a choice, and choose the version that is native to the application author -- for the best language (unless it is a language I do not know).
Update:
One situation when it has gone beyond ridiculous is DB2 (or its client tools, not sure), which forced me to install a Russian version, but all errors in this version were shown as "???????? ??? ??? ??".
Yes: at work, we have Windows XP deployed with the 'English' language setting (because we have sites worldwide and only one kind of Windows to deploy, with only one kind of settings when it comes to language).
Yet all our applications must run in French. The auto-detect feature alone would not be enough for an appropriate display of the labels.
Sometimes when you are trying to describe something to a user over the phone and you are in a different location, it is very annoying when you are both looking at the same URL, but see different results. You might even go so far as to include the language in the URL, similar to how Wikipedia does it (e.g. en.wikipedia.org).
Also sometimes a user will be on a friend's computer and try to access a website but won't see it in their preferred language, because of the language settings on the computer.
I think the best solution would be to allow the user to override the setting, but default it to the auto-detected language.
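That precedence can be written down in a few lines. This is only a sketch: the function name, its parameters and the default of "en" are assumptions, and where the remembered choice is stored (cookie, account setting, URL) is left open:

```python
def resolve_language(user_choice, detected, supported, default="en"):
    """An explicit, remembered user choice wins over auto-detection."""
    for candidate in (user_choice, detected, default):
        if candidate in supported:
            return candidate
    return default

supported = {"en", "fr", "de"}
print(resolve_language(None, "fr", supported))  # fr: nothing chosen yet
print(resolve_language("en", "fr", supported))  # en: the override wins
print(resolve_language(None, "ja", supported))  # en: unsupported, use default
```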
I agree that the auto-detect is not enough.
Not many users know the settings for selecting their language. Therefore the settings will often be left at the default and will therefore be incorrect (for non-English users).

Resources