Should I use unique ids or the whole sentences when localizing software? - localization

I need to translate a website to a couple languages, and I've already read how to do it:
Mark strings for translation
Generate messages file
Translate messages file
The problem is, if I use whole sentences as the message ids, in english, for example, then if later I decide to modify the text, I'll have to change it in the code and on each message file . Or I could just change the english translation, but then my english message file will look weird, with translations from english to english which do not match.
Example:
Original: "I don't know what to do."
Translation: "I'm not sure what to do."
An alternative is to use unique message ids such as:
Original: "INDECISION_MESSAGE"
Translation: "I'm not sure what to do."
The advantage is that I can change translations without changing the id and things will still be consistent. But then there is no easy way for a translator to know what the message should be like as there is no context except by looking at the code.
What would you use?

In PHP people normally use the first approach you mentioned. In ASP.NET the second.
I think it's more of a personal taste and a matter of the framework that you use.
Personally I prefer the second option. Since you normally already have the ID/translation pair somewhere in a list, the translator can just take that list to translate.
What framework do you use?

Related

What language is this Salesforce code that I need to wrap?

I'm working on a Salesforce coding issue. Let me preface this by saying I'm not a developer or Salesforce expert.
What language is this?
Data Type FormulaThis formula references multiple objects
IF (Fulfillment_Submission_Form_URL__c <> "" && CONTAINS(Fulfillment_Submission_Form_URL__c, "qualtrics"),
Fulfillment_Submission_Form_URL__c &
(IF (CONTAINS(Fulfillment_Submission_Form_URL__c,"?SID="), "&", "?")) &
(IF (CONTAINS(TEXT(Type__c), "Site Visit"),
"ContactId="&Statement_of_Work__r.Contractor_Contact__c&
"&CoachType="&SUBSTITUTE(Statement_of_Work__r.Work_Type__r.Name," ","%20")&
"&CoachName="&SUBSTITUTE(Statement_of_Work__r.Contractor_Name__c," ","%20")&
"&InitPartId="&Initiative_Participation__r.Id&
"&InstitutionName="&substitute(substitute(SUBSTITUTE(Institution_Name__c," ","%20"),")",""),"(","")&
"&AccountId="&Initiative_Participation__r.Participating_Institution__r.Id&
"&TodaysDate="&TEXT(TODAY())&
"&SOWLineItemId="&Id&
"&LeaderCollege="&Initiative_Participation__r.ATD_Leader_College_Status__c&
"&SVRCompleted="&TEXT(Count_of_Site_Visit_Fulfillments__c)&
"&SVRRequired="&TEXT(Number_of_Work_Units_Allocated__c),
IF (CONTAINS(TEXT(Type__c), "Feedback"),
"InitPartId="&Initiative_Participation__r.Id&
"&SOWLineItemId="&Id&
"&ReportYear="&Statement_of_Work__r.SOW_Year__c&
"&UserId="&Contractor_User_Id__c&
"&InstitutionName="&substitute(substitute(SUBSTITUTE(Institution_Name__c," ","%20"),")",""),"(",""),
"")
))
,"")
Essentially it's pulling a link from another product we've integrated it with. We then take the basic link and reformat it to add parameters.
The problem is when it pulls in some parameters (ex: CoachName) the Coach entered their name in strange formats like: John (Coach) Doe.
So when the script outputs a URL that includes parameters it breaks at the &CoachName=John%20(Coach)% portion of the URL. Any easy way to work around this by modifying the script? Unfortunately we DO need that (Coach) identifier because the system we push to grabs that as well.
It's formula syntax, I'd compare it to Excel-like formulas. There's self-paced training if you don't want to read documentation. And as it's not exactly code-related you may have more luck on dedicated site, https://salesforce.stackexchange.com/. More admins lurk there.
So you do want that "(Coach)" to go through but it breaks the link? Looks like ( is a special character. It's not technically wrong to have unescaped parentheses, if it breaks that other site you might want to contact them and get their act together. RFC doesn't force us to encode them but looks like you'll have to to solve it at least in the short term: https://webmasters.stackexchange.com/questions/78110/is-it-bad-to-use-parentheses-in-a-url
Instead of poor man's encoding (SUBSTITUTE(Statement_of_Work__r.Contractor_Name__c," ","%20") try using proper URLENCODE(Statement_of_Work__r.Contractor_Name__c).
Or there's bit more "pro" function called URLFOR but the documentation doesn't make it very clear how powerful the 3rd parameter is with the braces [key1 = value1, key2 = value2] syntax. Basically just pass the parameters and let SF worry about encoding special characters etc.
Read my answer https://salesforce.stackexchange.com/a/46445/799 and there are some examples on the net like https://support.docusign.com/s/articles/DFS-URL-buttons-for-Lightning-basic-setup-limitations?language=en_US&rsc_301

Force "vouvoiement" Google cloud translation API

I have a lot of dialog files to translate from English to French. Google cloud translation API works fine but sometimes the translation returns "tu" and other times it returns "vous". In a nutshell, "tu" is informal and singular, while "vous" is formal and/or plural.
How to force the vous (formal) in all translations?
I checked documentation about glossaries, but it does not seem to be supported.
It is supported in this translator, but their API is not free.
Just checked myself and I don't believe this is possible with the API to simply favour formality, unfortunately.
Your best bet, if you don't want to use an alternative service, is to replace all instances of "tu" with "vous".
To achieve this, without replacing "tu" inside other words, you'll want to match and replace only "tu" as a standalone word, ensuring the size of each match is a length of 2.
Otherwise, you may end up creating things like tulip -> vouslip.
Unfortunately vous/tu is not a direct translation of a single word so the dictionary approach doesn't seem viable.

How to get http tag text by id using lua

There is a webpage parser, which takes a page contains several tags, in a certain structure, where divs are badly nested. I need to extract a certain div element, and copy it and all its content to a new html file.
Since I am new to lua, I may need basic clarification for things might seem simple.
Thanks,
The ease of extraction of data is going to largely depend on the page itself. If the page uses the exact same tag information throughout its entirety, it'll be much more difficult to extract than it would if it has named tags.
If you're able to find a version of the page that returns json format, then you're that much better off. Here's a snippet of code on something I wrote to grab definitions from a webpage that did not have json format:
local actualword, definition = string.match(wayup,"<html.-<td class='word'>%c(.-)%c</td>.-<div class=\"definition\">(.-)</div>")
Essentially, this code searched down the page until it found the class "word", and took the word after it (%c is the pattern for control characters). It continued on to "definition" and captured that, as well.
As you can see, it's a bit convoluted, but I had the luck of having specifically named tags for what I wanted.
This is edited to fit your comment. As a side note that I should have mentioned before, if you're familiar with regular expressions, you can use its model to capture what you need. In this case, it's capturing the string in its totality:
local data = string.match(page, "(<div id=\"aa\"><div>.-</div>.-</div>)")
It's rarely the fault of the language, but rather the webpage itself, that makes it hard to data mine anything. Since webpages could literally have hundreds of lines of code, it's hard to pinpoint exactly what you want without coming across garbage information. It's why I prefer a simplified result such as json, since Lua has a json module that can encode/decode and you can get your precise information.

URL Layout for multi-lingual CMS

So I'm trying to add a new, shiny REST webservice to our CMS. It should follow the REST "roules" pretty closely, so it shall use GET/POST/PUT/DELETE and proper, logical URLs. My design is heavily inspired by the Apigee Best Practices.
The CMS manages a tree of categories, where each category can contain a number of articles. For multi-lingual projects, the categories are duplicated for each language (so any article is uniquely identifier by its ID and the language ID). The structure is the same across all languages, only the positions can vary (so category X contains the categories Y and Z in every language, but Y can be before Z in language 1 and the other way around in language 2). Creating a new article or category always also creates copies in all languages. Deleting works the same way, so an article is always deleted in all languages.
FYI: Languages are identified by their numeric ID or their locale (like en_US).
(Please image a leading /v1 in front of all URIs; I've skipped it because it adds nothing to this question.)`
My current approach is having an URL scheme like this:
GET /articles :( returns all articles in all languages
GET /articles/:id-:langid :) returns a single article in a given language
POST /articles :) creates a new article
PUT /articles/:id-:langid :) update a single article in a given language
DELETE /articles/:id :) delete an article in all languages
But...
How to identify languages?
Currently, the locale each language is assigned does not need to be unique, so two languages can in rare cases have the same locale. Using the ID is guaranteed to be unique.
Using the locale (most likely forced to lowercase) would be nice, cause the URLs are more readable. But this would
maybe lead to projects with overlapping locales.
require clients to first fetch the locales (but I guess a client would otherwise have to fetch the language IDs anyway, so that's kind of okay).
I would like to end up with exactly one solution to this and tend towards using the locale. But is it worth risking overlapping locales?
(You can image langid=X to be interchangeable with locale=xx_xx in the following.)
Should the language be a hierarchy element?
In most cases, clients of the API will want to fetch the articles for a given language instead of all. I could solve this by allowing a GET parameter and offer GET /articles?langid=42. But since is such a common usecase, I would like to avoid the optional parameter and make it explicit.
This would lead to GET /articles/:langid, but this introduces the concept that the language is a hierarchy level inside the API. If I'm going to do so, I would like to make it consistent for the other verbs. The new URL layout would look like this:
GET /articles/:langid :) returns all articles in a given language
GET /articles/:langid/:id :) returns a single article in a given language
POST /articles :( creates a new article in all languages
PUT /articles/:langid/:id :) update a single article in a given language
DELETE /articles/:langid/:id :( delete an article in all languages
or
GET /:langid/articles :) returns all articles in a given language
GET /:langid/articles/:id :) returns a single article in a given language
POST /articles :( creates a new article in all languages
PUT /:langid/articles/:id :) update a single article in a given language
DELETE /:langid/articles/:id :( delete an article in all languages
This leads to...
What to do with global actions?
The above layout has the problem that POST and DELETE work on all languages, but the URL suggests that they work only on one language. So I could modify the layout:
GET /articles/:langid :) returns all articles in a given language
GET /articles/:langid/:id :) returns a single article in a given language
POST /articles :) creates a new article in all languages
PUT /articles/:langid/:id :) update a single article in a given language
DELETE /articles/:id :) delete an article in all languages
This makes for a somewhat confusing layout, as the language level is only sometimes present and one needs to know a lot about the system's interna. On the bright side, this matches the system very well and is very similar to the internal workings.
Sacrifices?
So where do I make sacrifies? Should I introduce alias URLs to have nice URL layouts but do maybe unintended stuff in the backgroud?
I had the exact same problem some time ago and I decided to put local in front of everything. While consuming content it makes much more sense to read URLs like:
/en/articles/1/
/en/authors/2/
/gr/articles/1/
/gr/authors/2/
than
/articles/en/1/
/authors/en/2/
/articles/gr/1/
/authors/gr/2/
In order to solve the "no specific locale" issue I would use a keyword referring to all available locales, or a default locale. So:
/global/articles/
or
/all-locales/articles/
or
/all/articles/
to be honest I like global and all because they make sense reading them.
DELETE /global/articles/:id :) delete an article in all languages
hope I helped
My initial approach would have been to have language as an optional prefix in the URL, e.g. /en_us/articles, /en_us/articles/:id, but if you don't want to 'pollute' your URLs with language identifiers you can use the Accept-language header instead, in the same way that content negotiation is defined in RFC 2616. Microsoft's WebAPI is using this approach for format negotiation.

ActionScript Replace Between String?

I have a string named MESSAGE and it varies depending on what people say.
Is there a way I can make MESSAGE replace the contents of a string inside of a string?
For example, if MESSAGE is equal to "Hey guys [rainbow]look at my awesome rainbow text[/rainbow] isn't it cool?" then how do I only replace the "[rainbow]look at my awesome rainbow text[/rainbow]" part by getting rid of "[rainbow]" and "[/rainbow]" and replacing "look at my awesome rainbow text" with a string called RAINBOWTEXT?
The reason I need this is because I would like users in my Flash-based chat to be able to make rainbow text, but the method I use needs to do this client-side. I did have a PHP version that would submit the text into the database, but it would cut the message off due to the large amount of data. Each character got a <font color="#123456"> and a </font> on either side of it, so messages were super long.
If I can replace the inside part of what is specified by [rainbow] and [/rainbow], I can have each person's client replace the message once it gets it. In the database, it will have "Hey guys [rainbow]look at my awesome rainbow text[/rainbow] isn't it cool?" but in chat it will have "Hey guys look at my awesome rainbow text isn't it cool?" with the actual text rainbowfied.
This sounds like a job for a regular expression and the replace() method. Have a dig around for expressions which will find and replace the contents of HTML tags and you should be able to adapt what you find to suit your requirements. Examples for JavaScript should also be valid as ActionScript and JavaScript both adhere to the ECMAScript standard for regular expressions.
Once you've found something, you can play around with it with this handy tool for testing ActionScript 3.0 regular expressions.

Resources