General purpose translations collection - translation

Does anybody know of a free collection of general-purpose simple phrase translations for different languages?
I mean, is there any multi-language database with simple phrases like "Yes", "No", "Forgot password?", "Resend email", "Sign up", etc.?
Maybe I'm just searching the wrong way, but that kind of database would be handy for anyone who wants to make (let's say) a website multilingual. But I can't find one.
OK, actually I did find some by searching for "multilingual phrase collection", e.g. http://www.uazone.org/multiling/unilang/index.html, but it has only 20 phrases and is not a database (it is not structured for programmatic processing).

You might use the OPUS project from the University of Helsinki, which collects multilingual sentences and has data from GNOME, KDE, Ubuntu, and OpenOffice (plus plenty of other sources, like movie subtitles or EU legislation).

Related

Conjugating words when localizing an application

When translating an application into Spanish (and consequently many other languages), what tense and conjugation should be used for buttons that are verbs? (e.g. "Submit", "Save", etc.)
I've converted my phone into Spanish to see what other apps do. It seems like some are using the infinitive of the verb, while others are conjugating it to the third person present tense.
I would think that using the third person present tense is the best way to go. I think that's what's done in English. However, I'm no grammar wiz, so I don't know if that's true or not.
I can't speak Spanish, so your approach may well be correct. I would recommend checking the Microsoft style guides, which are in English and available online.
Microsoft Style Guide Library
However, I wanted to draw your attention to a point about UI labels.
Buttons such as Submit, Save, and Cancel are translated as imperatives most of the time, since with these buttons you are giving a command to the device (you ask the device to "send" or "cancel"). On the other hand, when the device asks you to do something, the imperative is seen as offensive in some languages, such as Turkish, so you have to use the polite form of the imperative. Maybe that makes a difference in your case (such as usted vs ustedes).
I hope my answer helps you.

Open Ears API says every sound it hears is a word, even a cough

I am trying to use OpenEars for a small part of my app. I have three or four keywords that I want to be able to "listen" for, something like "Add", "Subtract", etc. I am just using the sample app found here. I want to have a special case in the app when I hear "Add" etc., as opposed to a word that is not one of my four keywords. Right now I set my language to be only the four keywords, but whenever the OpenEars API hears anything, it picks between my four keywords. So if I cough, it picks the closest word out of the four.
How can I listen for a specific word without always choosing one of the keywords?
I was thinking I could have a whole bunch of words, a few hundred, and just check which word was spoken, with a special case for my four keywords, but I don't want to have to type out each word. Does OpenEars provide any default languages?
OpenEars developer here. Check out the dynamic grammar generation API that was just added in OpenEars 1.7 which may provide the right results for your requirements: http://www.politepix.com/2014/04/10/openears-1-7-introducing-dynamic-grammar-generation/
This approach might be more suitable for keyword detection and detection of fixed phrases. Please bring further questions to the OpenEars forums if you'd like to troubleshoot them with me.

Detect when to use a vs an

I have a service that allows users (admins) to change the terminology the site uses. My designer wants me to use the format "A Group". The problem is that for some terminology it should be "An", not "A".
Is there any way to reliably detect which to use? What about localization?
I can brute force it and get 90% of the way by checking the first letter for consonant vs vowel. That won't work for all words though. And that doesn't cover any language except English.
In my opinion you've only got two options:
1- Check the first letter, and scan the whole sentence to see whether it contains any non-English letters.
2- Provide a dictionary of English nouns, so you can easily look up your word to find out whether it needs "a" or "an".
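A minimal sketch of how the two options could be combined for English only: a first-letter check, overridden by small exception lists. The function names and the exception words are assumptions chosen for illustration. A real word list would have to be much larger, and other languages would need their own rules or per-locale templates.

```python
import re

# Hypothetical exception lists, for illustration only.
# Words that start with a consonant letter but a vowel sound (silent "h").
AN_EXCEPTIONS = {"hour", "honest", "honor", "heir"}
# Words that start with a vowel letter but a consonant sound.
A_EXCEPTIONS = {"user", "unit", "university", "european", "one"}


def indefinite_article(term: str) -> str:
    """Return "a" or "an" for an English term.

    Looks at the first word only: exception lists are checked first,
    then the first letter decides (vowel letter -> "an").
    """
    words = re.findall(r"[A-Za-z]+", term)
    if not words:
        return "a"  # nothing to inspect, fall back to "a"
    first = words[0].lower()
    if first in AN_EXCEPTIONS:
        return "an"
    if first in A_EXCEPTIONS:
        return "a"
    return "an" if first[0] in "aeiou" else "a"


def with_article(term: str) -> str:
    """Prefix the term with a capitalized article, as in "A Group"."""
    return f"{indefinite_article(term).capitalize()} {term}"


print(with_article("Group"))
print(with_article("Organization"))
print(with_article("Hour"))
print(with_article("User"))
```

Since the terminology is entered by admins, another option under the same assumptions would be to let the admin pick the article when saving the term, and use the heuristic only as the default.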
Although the "a versus an" issue is very specific, what you're describing here is a natural language processing issue. Essentially you are being asked to write code that generates a grammatically correct piece of text.
I think you should try to explain the implications to the designer, especially if you end up localizing into other languages. Your time is probably better spent working on your app's business logic than on language processing.

Crowdsourcing translation for mobile developers?

I am developing applications for mobile phones with different operating systems (Android, Symbian, iPhone). The applications are sold internationally, so they need to be translated into different languages in addition to the English version.
I assume most mobile developers get the translations done using some paid external service each time. This approach does not look very cost-effective to me. Would it make sense to have a website where simple translations would be done using crowdsourcing (by other developers)? Most strings in mobile applications are very simple and short, for example "OK", "Cancel", "Are you sure?", "Please enter your password". Also, the same strings are used in hundreds of applications. Instead of paying for the translation of all strings, developers could save money by buying only their difficult, application-specific translations.
Does anyone agree with this idea? I have seen many open-source projects doing their translations successfully using volunteers.
I just found a solution that works for me. Many users find this question through Google, so I think my post may be helpful:
This is the solution for us: crowdin.com - agile localization solution for tech companies
Microsoft allows you to view their terminology database: https://www.microsoft.com/Language/en-US/Default.aspx
That covers about 90 languages and will get you the things you mention such as common button captions, etc.
The problem you face after that is getting only the strings you want translated. Most translators are going to charge you for a minimum number of words. And they are going to want the entire resource file (regardless of whether you translated some of the strings yourself or not). That makes sense, because localizing a product means that they need to have the whole picture to ensure consistency, etc. Professional translators will probably not charge you for what they call 100% matches.
I would never ever trust the translation of my product to crowd sourcing. Ever. You get what you pay for. Besides, just because you speak a language natively doesn't mean that you can write well, etc.
How do you check the crowd sourcing translation results for accuracy and quality? In a famous and documented occurrence recently the phrase "No lorries by this route please use the main road" was translated into "We are out of the office until Monday please contact us again then" and turned into road signs that were erected.
Crowd sourcing translation has been used, and Facebook is probably the largest company I know of that tried/used it. I have not tracked their progress, but you could investigate it to see its success or otherwise. Their method of quality checking was to get other people using the translations to vote for the one they preferred, so this was a case of crowd sourcing quality control. At this point the proposal that a camel is a horse designed by a committee jumps unbidden into my mind.
Translation, in spite of all the machine power pumped into it, is still more of an art than a science. To translate correctly you need a native speaker translating from another language into their own. So for English to German you need a native German speaker who can speak English very well. Within the profession very, very few translators will translate into a language in which they are not native. The reasons for this are many, but they boil down to the colloquial nature of language.
To be positive, you could look at how Facebook fared and follow that route. Another route would be to approach not translators but a translation agency; there are quite a number of these. Present them with the whole corpus you want translating in the original English and get them to quote you for the whole job. This would mean someone else managing the job and the quality, and they may have shortcuts, especially if the translations are of fairly standard "computerese" type phrases, e.g. 'Home', 'Back', 'Next', 'Click here', etc.

Need some input on how to build a large scale text replacement system

My Rails app deals a lot with data from third-party APIs (specifically UPS, FedEx, DHL, etc).
What I'd like to do is whenever that data comes in, replace certain phrases with customized phrases.
Example: "On FedEx vehicle for delivery" (which we get from the FedEx API), I'd like to replace with "Out for Delivery."
Is it best to replace the text on its way in to the database? Or on output? (Talking from an end-user speed perspective)
I'm planning on storing these phrases in our database, so I'm assuming I'd just create a helper that pulls the phrases I want to replace and then run the strings through those using gsub and replace as necessary?
Any tips on making this efficient and easy to manage would be great.
For speed you should replace the phrases when they enter the database. If you do it on output, you'll have to do it every time a user requests the data, and doing it on every request will put more load on the server.
You may, however, want to store the original phrases, in case you want to change the wording in the phrases you replace with.
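A rough sketch of that idea: replace on the way in, but keep the original next to the rewritten text. It is written in Python for illustration; in the Rails app the same approach would presumably be done in Ruby with gsub. The field names, the function names, and the second phrase in the table are assumptions, and in the setup described in the question the table would be loaded from the database instead of being hard-coded.

```python
import re

# Hypothetical replacement table. The first pair is the example from the
# question; the second pair is made up for illustration.
REPLACEMENTS = {
    "On FedEx vehicle for delivery": "Out for Delivery",
    "Departed FedEx location": "In Transit",
}


def build_replacer(replacements):
    """Compile all phrases into one regex and return a replace function.

    Building the pattern once avoids looping over every phrase for every
    incoming string.
    """
    lookup = {phrase.lower(): text for phrase, text in replacements.items()}
    if not lookup:
        return lambda raw: raw  # nothing to replace
    # Longest phrases first, so a longer phrase wins over a shorter one
    # that it contains.
    phrases = sorted(lookup, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in phrases), re.IGNORECASE)

    def replace(raw):
        # Fall back to the matched text if the lowercased match is not a key.
        return pattern.sub(
            lambda m: lookup.get(m.group(0).lower(), m.group(0)), raw
        )

    return replace


def ingest(raw, replace):
    """Build the record to store: the vendor text and the rewritten text."""
    return {"raw_status": raw, "display_status": replace(raw)}


replace = build_replacer(REPLACEMENTS)
print(ingest("On FedEx vehicle for delivery", replace))
```

Because the raw text is kept, changing the wording later only means re-running the replacer over the stored originals.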
Just a random idea, which might not be applicable depending on how your data is, but maybe you could leverage the i18n framework that's built into Rails for this. The original text could be viewed as a separate language called vendorspeak :-).

Resources