MediaWiki automatic "translation", creation of new pages - translation

I need to create automatically "translated" pages in the MediaWiki framework, using an in-house set of rules that transform the old language's spelling into the new one.
Example: a page containing "gylden" (the "old English" language) has to be automatically "translated" to "golden", and the "contemporary English" page has to be regenerated whenever the source page is modified. Both the "old English" and the "contemporary English" pages have to be accessible through Wikipedia-style subdomains, with identical slugs (e.g. en.site.com/slug1 and old-en.site.com/slug1).
Is there an add-on that would ease the pain of building an in-house "translation" module? If not, what is the best strategy for starting from scratch, given the problem description?
Manual "translation" (creating the pages by hand) is not an option, for numerous reasons.
P.S. The actual problem is converting pre-reform Russian Cyrillic text to contemporary orthography, which involves dictionary look-ups as well as direct substitution of old characters.

You have to add your own Language subclass with a LanguageConverter, which handles transliteration and any sort of transformation you want.
See a language converter in action on Serbian: /sr-ec/Главна_страна vs. /sr-el/Главна_страна.
Wikipedia-style subdomains are for independent wikis, while you want a single wiki with different representations. Of course you could configure the subdomains to be rewritten to the language converter "directories", but there is no built-in support for that AFAIK.
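As a rough illustration of the kind of rules such a converter would apply in the pre-reform Russian case, here is a minimal Python sketch. It is not MediaWiki code (a real LanguageConverter subclass is written in PHP), and the character table, the two-entry exception dictionary and the function names are all assumptions made for the example. Exceptions are looked up first, then old letters are substituted, then the word-final hard sign is dropped.

```python
import re

# Direct one-to-one substitutions for letters removed by the spelling reform.
# Illustrative table only; a real rule set would be larger.
CHAR_MAP = str.maketrans({
    "ѣ": "е", "Ѣ": "Е",
    "і": "и", "І": "И",
    "ѳ": "ф", "Ѳ": "Ф",
    "ѵ": "и", "Ѵ": "И",
})

# Hypothetical exception dictionary for words that plain substitution
# would get wrong. Keys are lower-case pre-reform spellings.
WORD_MAP = {
    "ея": "её",
    "онѣ": "они",
}

WORD_RE = re.compile(r"\w+")


def convert_word(word):
    """Convert one word: dictionary first, then character rules."""
    replacement = WORD_MAP.get(word.lower())
    if replacement is not None:
        # Preserve an initial capital letter.
        return replacement.capitalize() if word[0].isupper() else replacement
    word = word.translate(CHAR_MAP)
    # Drop the hard sign at the end of a word; keep it in the middle.
    if len(word) > 1 and word[-1] in "ъЪ":
        word = word[:-1]
    return word


def convert(text):
    """Convert every word in the text, leaving punctuation untouched."""
    return WORD_RE.sub(lambda m: convert_word(m.group(0)), text)
```

Calling convert on the page text whenever the source page is saved would give the regeneration behaviour described in the question; inside MediaWiki the same rules would live in the converter class instead.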

Related

Localization in the Business Layer

We are writing an ASP.NET MVC application and every once in a while we need to add a string to a description or a note that is not generated by the UI to a database record. For example, if we need to reverse a transaction, we will prepend the word 'Reverse' to the description of the original transaction.
What is the best way to go about localizing these strings we have to add every now and again? On the web project we are using resource language files, so everything is really taken care of by the .NET Framework. Can a class library project (the business layer) take advantage of automatic localization the way a web project does?
What we usually do is keep additional resource files in the back-end projects. They usually don't grow that big, so I think it's safe to do that. You can then access those resources like this:
string dummy = Properties.ResourceFileName.Reverse;
If you add a resource file in Visual Studio, the IDE will take care of generating the required code-behind to make that work.
For your example, I would suggest a string like this:
Reverse {0}
Then replace {0} with the actual transaction description. This way the translator can move {0} before "Reverse" if the target language requires it. This is just an example, but best practice is to avoid concatenating localizable strings, because the result may break in other languages. Example:
string dummy = string.Format(Properties.ResourceFileName.Reverse, transactionDescription);

Localizing Static Text asp.net mvc3.What is the approach?

I am learning ASP.NET MVC and building a small website that will initially be in two languages.
Ten or more of the pages are static pages with bold bits etc.
What is the best approach for localising these pages?
Is there a way to do it without creating a separate page per language? That would be a no in my book.
How do you handle localisation of static pages in asp.net mvc? In asp.net there was some sort of localise control.
Any suggestions?
The way to do this is with resource files. You create a resource file for your default language and then one for each other language your site should run in.
This article describes how to do it. For example, if you want English (default) and French, you could create two resource files (.resx files): Website.resx and Website.fr-FR.resx. The first file is for your default language, English, and the second is for French. Both files consist of key-value pairs.
EDIT: Another interesting article describing the same idea can be found here.
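To make the key-value idea concrete, here is a small Python sketch of how a lookup across such files behaves, assuming the usual fallback from a specific culture to its parent and then to the default file. The dictionary stands in for Website.resx and Website.fr-FR.resx; the keys and the get_string helper are made up for the illustration and are not the .NET API.

```python
# Stand-in for the two .resx files: "" is the default (English) file.
RESOURCES = {
    "": {"Welcome": "Welcome", "About": "About us"},
    "fr-FR": {"Welcome": "Bienvenue"},
}


def fallback_chain(culture):
    """Return the cultures to try, most specific first, default last."""
    chain = []
    while culture:
        chain.append(culture)
        # "fr-FR" -> "fr" -> "" (stop)
        culture = culture.rpartition("-")[0]
    chain.append("")
    return chain


def get_string(key, culture):
    """Look the key up in each culture of the chain until it is found."""
    for name in fallback_chain(culture):
        table = RESOURCES.get(name, {})
        if key in table:
            return table[key]
    raise KeyError(key)
```

A key that has not been translated yet (here "About" for French) falls back to the default text, so a page never has to be duplicated per language just because one string is missing.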
I think the best approach is still to create two files for them, because they will not stay static forever and may change in the future. Alternatively, you can generalize the solution: store the text for both languages in a database and render the correct content based on the selected culture. The pages can be stored the same way news items are in a newsroom: whenever you add an item, you enter the text for both cultures.

Localization Strategy

We're currently debating two strategies of localization:
A. Have an XML file for the structure of the business objects, with a LocalizedKey pointing into a separate CSV file that holds the translations.
E.g. /Resources/Schema.xml
A separate CSV file holds all key/value pairs for the translations:
/Resources/Localized.txt
Model_Title, Title, Title (in French), ...
This way, when the structure changes, we change the XML only once while the LocalizedKeys stay in place.
B. Have separate XML files for each language based on Culture.
eg. have two files:
/Resources/en-US/US-Localized.xml
/Resources/fr-AU/AU-Localized.xml
This way, the files share the same schema but are kept separate. The user would therefore have to make sure the schemas stay identical, because any structural change has to be made twice, as opposed to Option A, where it is made only once.
However, readability is much better here, since the user would not have to track down the key to make the changes.
What are your thoughts/ideas on the strategies I suggested?
Thanks,
The environment is not clear: web? desktop? internal enterprise integrated something-or-other? Is there any particular reason you aren't using whatever i18n framework your tool chain supports (gettext, .NET resource files...)?
In general I'd say you want to separate resources by culture (though, to be honest, fr_AU should be rare). That gives better maintainability, and in many situations you do not have to load a single file containing every per-culture version. This is especially true if the number of supported languages/cultures goes into the dozens or more.
However, it is important to accommodate XML schema changes. The XML could be auto-generated from simpler structures (key-value pairs, either in a database or in files) and validated against a common schema.
This applies whether (as commenters noted) you are providing localized products yourself or customers can create their own localizations.
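A minimal Python sketch of that auto-generation idea, assuming a key/value table with one column per culture. The column layout, the element names ("resources", "string") and the sample rows are invented for the example; they are not the asker's actual schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical key/value source: one row per key, one column per culture.
CSV_TEXT = """key,en-US,fr-FR
Model_Title,Title,Titre
Model_Author,Author,Auteur
"""


def build_culture_xml(csv_text):
    """Return {culture: xml_string}, one document per culture column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    cultures = [name for name in reader.fieldnames if name != "key"]
    roots = {c: ET.Element("resources", {"culture": c}) for c in cultures}
    for row in reader:
        for c in cultures:
            # Every document gets the same keys, so the structure
            # cannot drift between cultures.
            item = ET.SubElement(roots[c], "string", {"key": row["key"]})
            item.text = row[c]
    return {c: ET.tostring(root, encoding="unicode") for c, root in roots.items()}
```

Because every per-culture file is produced from the same rows, a structural change is made once in the source table, which keeps the maintainability of option A while still shipping separate files as in option B.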
In general, you should consider existing tools rather than starting from scratch.
In .NET we are using the Data Driven ASP.NET Localization Resource Provider and Editor created by Rick Strahl.

Language Strings in URLs

What would be a good way to handle URLs on a website that offers multiple languages, but has one primary language (in my case, English)?
What should be the address of the home page in English? http://example.com/? http://example.com/en/? http://example.com/english/? Other?
What should be the address of the home page in another language, say, German? http://example.com/german/? http://example.com/de/? http://example.com/deutsch/?
Would the use of language-specific subdomains be appropriate? What would you do and why?
It kind of depends on the structure of your site:
If every language is considered a completely different site, use sub-domains for the language.
This is because different sub-domains are considered different sites by many technologies. Wikipedia does this (http://de.wikipedia.org/) to separate content for different languages entirely.
I wouldn't recommend this option unless your site is very big.
If every language has its own structure, but the languages are still considered versions of the same site, use a top-level "directory" for each language.
For the sake of consistency, I would also have one for the default language (omitting it would cause a redirect to the appropriate structure). I would recommend /en/, /de/, etc., since they are short and concise, and also the standard way of indicating languages.
This is probably your best bet.
If the structure of the site is identical no matter what language it is, and only content on the pages changes depending on the language, you could also consider putting the language modifier as a parameter: /home?lang=en
Google does this, for example: http://www.google.com/search?hl=de&q=foo (they also separate languages by TLD, though.)
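The three schemes differ only in where the language code sits in the URL. A small Python sketch of reading it back out, assuming a site that supports "en" and "de" and uses a query parameter named lang (the function name and the supported set are made up for the example):

```python
from urllib.parse import urlsplit, parse_qs

SUPPORTED = {"en", "de"}  # assumed set of site languages


def language_from_url(url):
    """Return (language, scheme) where scheme names the URL style used."""
    parts = urlsplit(url)

    # 1. Sub-domain: http://de.example.com/
    host_label = (parts.hostname or "").split(".")[0]
    if host_label in SUPPORTED:
        return host_label, "subdomain"

    # 2. Top-level "directory": http://example.com/de/...
    segments = [s for s in parts.path.split("/") if s]
    if segments and segments[0] in SUPPORTED:
        return segments[0], "path"

    # 3. Query parameter: http://example.com/home?lang=de
    values = parse_qs(parts.query).get("lang", [])
    if values and values[0] in SUPPORTED:
        return values[0], "query"

    return None, None
```

With the "directory" scheme the rest of the path can stay identical across languages, which is part of why it tends to be the easiest one to route and to link between.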
Away from the question of how the international URLs should be styled (as that has been covered adequately already)...
One thing that I would personally do is make the site's 'main' domain (i.e. http://example.com) redirect the user appropriately depending on the Accept-Language HTTP header passed by the browser. This is what google.com does, for example.
If you do this, however, make sure that it's possible to switch to another language easily - and save the settings via some other mechanism to allow persistent override (cookies!).
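A hedged Python sketch of that redirect decision, assuming the /en/ and /de/ prefixes discussed above and a cookie that stores the user's explicit choice. The function names and the supported list are assumptions, and the header parsing is deliberately simplified (it reads only the q parameter).

```python
SUPPORTED = ("en", "de")  # assumed site languages
DEFAULT = "en"            # primary language of the site


def parse_accept_language(header):
    """Return language tags from the header, highest q-value first."""
    weighted = []
    for part in header.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        q = 1.0  # a tag without a q parameter has full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        weighted.append((q, tag.strip().lower()))
    # The sort is stable, so equal weights keep the browser's order.
    weighted.sort(key=lambda item: item[0], reverse=True)
    return [tag for q, tag in weighted if q > 0]


def choose_language(accept_language, cookie_lang=None):
    """A saved user choice wins; otherwise use the browser preference."""
    if cookie_lang in SUPPORTED:
        return cookie_lang
    for tag in parse_accept_language(accept_language or ""):
        primary = tag.split("-")[0]  # "de-DE" -> "de"
        if primary in SUPPORTED:
            return primary
    return DEFAULT


def redirect_target(accept_language, cookie_lang=None):
    """Path the bare domain should redirect to."""
    return "/" + choose_language(accept_language, cookie_lang) + "/"
```

Checking the cookie before the header is what makes the override persistent: once a user switches language by hand, the browser setting no longer pulls them back.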
What should be the address of the home page
Would the use of language-specific subdomains be appropriate?
However you like; it doesn't really matter. Design it to be intuitive for the users.
Language names encoded in URLs won't matter for SEO, because nobody will be searching for "en" or "de". The names of the products you're offering, however, will matter very much, because people will be searching for products like "gifts" or "geschenke".
I think the cleaner solution is to use http://yourdomain.com as the home page URL and to identify the localized pages with ISO 639-1 language codes.

Having trouble implementing multilingual umbraco 3.0

How to implement the multilingual umbraco 3.0?
There are two different approaches to this.
The documentation on the Umbraco website describes how to do 1:1 multilingual sites. This means that you have one site structure, and a single document type has a separate tab for each language's translation of the content. The language is then chosen with an on-page selector on the website (a flag icon or the like).
Here's an example of a 1:1 site
This is the most efficient set up if you have lots of shared content i.e. the content and structure is exactly the same, the language is just different.
The second approach is to use separate page structures for each language, such as:
International Homepage
------> English Homepage
------------> English content page
------> French Homepage
------------> French content page
The advantage of this structure is that it is very easy to set up, but if you share lots of content it can be cumbersome to manage. It also has the advantage that you can lock the editing permissions down for country/language specific editors.
With the above structure you can also point individual URLs to the country pages.
Without knowing more about what exactly your requirements are it's hard to answer more fully as to which is the best approach. It may also be possible to create a hybrid solution.
Here are some links which may help:
http://forum.umbraco.org/yaf_postst2209_Multilingual-structure-in-umbraco.aspx
http://www.nibble.be/?p=32

Resources