I'm building a site which needs to support both English and Persian language. The site is built with ASP.NET MVC 3 and .NET 4 (C#). All my controllers inherit from a BaseController, which sets culture to "fa-IR" (during test):
Thread.CurrentThread.CurrentCulture = new CultureInfo("fa-IR");
Thread.CurrentThread.CurrentUICulture = new CultureInfo("fa-IR");
In my views, I'm using static helper classes to convert into right timezone and to format date. Like:
DateFormatter.ToLocalDateAndTime(model.CreatedOnUtc)
I'm doing the same for money with a MoneyFormatter. For money, the currency prints correctly in Persian (or, at least I think so as I don't know any Persian. It will be verified later on though by our translators). The same goes for dates, where the month is printed correctly. This is what it displays as currently:
For money:
قیمت: ريال 1,000
For dates:
21:01 ديسمبر 3
In these examples, I want the numbers to also print in Persian. How do you accomplish that?
I have also tried adding:
<meta http-equiv="Content-Type" content="text/html; charset=windows-1256" />
in the HEAD tag, as recommended on some forums. Does not change anything though (from what I can see). I have also added "fa-IR" as the first language in my browser languages (both IE and Firefox). Didn't help either.
Anyone got any ideas on how to solve this? I'd rather avoid creating translation tables between English numbers and Persian numbers, if possible.
Best regards,
Eric
After some further research I found an answer from a Microsoft employee stating that they don't translate numbers, though it's fully possible to do it yourself using the array of digits for a specific culture, that is provided by the NativeDigits property (see http://msdn.microsoft.com/en-us/library/system.globalization.numberformatinfo.nativedigits.aspx).
To find out if text that is submitted in a form is Arabic or Latin I'm now doing:
public bool IsContentArabic(string content)
{
string pattern = #"\p{IsArabic}";
return Regex.IsMatch(
content,
pattern,
RegexOptions.RightToLeft | RegexOptions.IgnoreCase | RegexOptions.Multiline);
}
public bool IsContentLatin1(string content)
{
string pattern = #"\p{IsBasicLatin}";
return Regex.IsMatch(
content,
pattern,
RegexOptions.IgnoreCase | RegexOptions.Multiline);
}
And to convert Persian digits into their "Latin" equivalents, I wrote this helper:
public static class NumberHelper
{
private static readonly CultureInfo arabic = new CultureInfo("fa-IR");
private static readonly CultureInfo latin = new CultureInfo("en-US");
public static string ToArabic(string input)
{
var arabicDigits = arabic.NumberFormat.NativeDigits;
for (int i = 0; i < arabicDigits.Length; i++)
{
input = input.Replace(i.ToString(), arabicDigits[i]);
}
return input;
}
public static string ToLatin(string input)
{
var latinDigits = latin.NumberFormat.NativeDigits;
var arabicDigits = arabic.NumberFormat.NativeDigits;
for (int i = 0; i < latinDigits.Length; i++)
{
input = input.Replace(arabicDigits[i], latinDigits[i]);
}
return input;
}
}
I've also hooked in before model binding takes place and there I convert digit-only input from forms into Latin digits, if applicable.
Ako, for what direction goes we managed to solve quite a bit of issues using the 'dir' attribute (dir => direction) on the html tag for our Web page. Like this:
<html dir="rtl">
The 'dir' attribute takes either "rtl" (right-to-left) or "ltr" (left-to-right).
Hope this helps someone!
Maybe you have to change the font.
If you know your client supports a particular font, say Badr or Nazanin, then you can change the font-family for those numbers, and I think that will work for you.
You can supply for the font for your page, but I think this only works in modern browsers and fails in the old ones. You can check this.
Hope that helps.
How are you printing the numbers to the output? I don't have much experience with localization but you might have to use the appropriate NumberFormatInfo or something similar to format the number.
Also for greatest portability you should probably be using UTF8 as the encoding.
firs of all you should use a font-face that supports Persian number.
I use this technique and I can show Persian number on my web site.
fonts like:BNazanin.
Related
I've tryed this solution here Jsoup set Accept-Language to make my JSoup connection only accept pages that come in english. Unfortunately it didn't work and my output was still:
I don't know if this happened because when the user agent says ("Accept-Language", "en") is to rather take english pages version instead of other languages or really for just accept the english ones. Anyway, the question speaks for itself, if there is a way to, how may I set the JSoup connection to only accept english pages?
Is it possible for you to write a language check method
public String languageCheck(String page){
String result = ""
List<String> pagesArray= new ArrayList<String>();
pagesArray.add(page);
if(pagesArray.get(0)!=null){
char []test = pagesArray.get(0).trim().toCharArray();
if(Character.isLetter(test[0]))
{
System.out.println("ok");
result = "okay"
}else{
System.out.println("Nok");
result = "Not Okay"
}
}
return result;
}
use this method to remove non English pages, hope it helps you
I have an application that builds an HTML email. Included in the content is an encoded URL parameter that might, for example, contain a promotional code or product reference. The email is generated by a Windows service (essentially a console application) and the link, when clicked is handled by an MVC web site. Here is the code for creating the email link:
string CreateLink(string domain, string code) {
// code == "xyz123"
string encrypted = DES3Crypto.Encrypt(code); // H3uKbdyzrUo=
string urlParam = encrypted.EncodeBase64(); // SDN1S2JkeXpyVW890
return domain + "/" + urlParam;
}
The action method on the MVC controller is constructed as follows:
public ActionResult Index(string id) {
string decoded = id.DecodeBase64();
string decrypted = DES3Crypto.Decrypt(decoded);
...
}
In all our testing, this mechanism has worked as expected, however, now we have gone live we are seeing around a 4% error rate where the conversion from base-64 fails with the following exception:
The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or a non-white space character among the padding characters.
The id parameter from the url 'looks' OK. The problem appears to be with the EncodeBase64/DecodeBase64 methods that are failing as DecodeBase64 method returns a 'garbled' string such as "�nl����□��7y�b�8�sJ���=" on the failed links.
Furthermore, most of the errors are from IE6 user agents leading me to think this is a character encoding problem but I don't see why.
For reference, here is the code for my base-64 URL encoding:
public static string EncodeBase64(this string source)
{
byte[] bytes = Encoding.UTF8.GetBytes(source);
string encodedString = HttpServerUtility.UrlTokenEncode(bytes);
return encodedString;
}
public static string DecodeBase64(this string encodedString)
{
byte[] bytes = HttpServerUtility.UrlTokenDecode(encodedString);
string decodedString = Encoding.UTF8.GetString(bytes);
return decodedString;
}
Any advice would be much appreciated.
To recap, I was creating a URL that used a base-64 encoded parameter which was itself a Triple DES encrypted string. So the URL looked like http://[Domain_Name]/SDN1S2JkeXpyVW890 The link referenced a controller action on an MVC web site.
The URL was then inserted into an HTML formatted email. Looking at the error log, we saw that around 5% of the public users that responded to the link were throwing an "invalid base-64 string error". Most, but not all, of these errors were related to the IE6 user agent.
After trying many possible solutions based around character and URL encoding, it was discovered that somewhere in the client's process the url was being converted to lower-case - this, of course, broke the base-64 encoding (as it is uses both upper and lower case encoding characters).
Whether the case corruption was caused by the client's browser, email client or perhaps local anti-virus software, I have not been able to determine.
The Solution
Do not use any of the standard base-64 encoding methods, instead use a base-32 or zBase-32 encoding instead - both of which are case-insensitive.
See the following links for more details
Base-32 - Wikipedia
MyTenPennies Base-32 .NET Implementation
The moral of the story is, Base-64 URL encoding can be unreliable in some public environments. Base-32, whilst slightly more verbose, is a better choice.
Hope this helps.
It looks like you were really close. You had an extra zero coming back from your encyrpted.EncodeBase64() function.
Try this:
string data = "H3uKbdyzrUo=";
string b64str = Convert.ToBase64String(UTF8Encoding.UTF8.GetBytes(data));
string clearText = UTF8Encoding.UTF8.GetString(Convert.FromBase64String(b64str));
This is an interesting issue. My guess is that IE 6 is eating some of the characters.
For example, the length of the string that you included "ywhar0xznxpjdnfnddc0yxzbk2jnqt090" is not a multiple of four (which is a requirement for FromBase64 to work http://msdn.microsoft.com/en-us/library/system.convert.frombase64string.aspx)
But if you were to pad that string until it's length is a multiple of four ("ywhar0xznxpjdnfnddc0yxzbk2jnqt090" + "a12") then that works.
The MSDN documentation says that one ("=") or two ("==") equal characters are used for padding to/fromBase64 methods and I suspect IE 6 is truncating that from the string that you send.
This is total speculation but I hope it helps.
.txt file looks like
Euro
US Dollar
Australian Dollar
Pounds Sterling
Swiss Franc
and so on.
I have tried things like XDocument and ObservableCollection but can't seem to get them to work.
I would rather not hard code so much into xaml.
thanks,
Assuming you're sending the file with the project (file Build action set to "Content"), here's what you need:
First, add a ListBox to the Page called CurrenciesListBox, then add this code on the page load event or constructor:
var xapResolver = new System.Xml.XmlXapResolver();
using (var currenciesStream = (Stream)xapResolver.GetEntity(new Uri("Currencies.txt", UriKind.RelativeOrAbsolute), "", typeof(Stream)))
{
using (var streamReader = new StreamReader(currenciesStream))
{
while (!streamReader.EndOfStream)
{
CurrenciesListBox.Items.Add(streamReader.ReadLine());
}
}
}
Remember to change the filename above to match your file!
This is for starter, there are better ways to do the job (using MVVM)!
The Problem:
I'm about to implement language localization to an already very large ipad application that was built using sencha touch wrapped in phonegap. I have the english and spanish translations in json files.
What I'm Planning on Doing:
I plan to load the json files into a sencha touch store, creating a global object. Then in each place where I call text that is displayed, i will replace the text with a call to the global object.
My question(s):
Is there an easier way to implement language localization with my
setup?
Will I run into issues with native sencha stuff (like datepickers)?
When loading/reloading language json files, will I have performance
issues (require a webview reload?, sencha object resizing issues,
etc)
edit 1 : Useful Related Info:
For those going down this road, it quickly becomes useful to write a simple phonegap plugin to get the ipad/iphone device's language setting into your javascript. This requires a plugin, which will look something like this:
Javascript :
part 1:
PhoneGap.exec("PixFileDownload.getSystemLanguage");
part 2(callback Function):
setLanguage(returnedLanguage)
{
GlobalVar.CurrentLanguage = returnedLanguage; //GloablVar.CurrentLanguage already defined
}
Objective C:
-(void)getSystemLanguage:(NSMutableArray*)paramArray withDict:(NSMutableDictionary*)options
{
/*Plugin Details
PhoneGap.exec("PixFileDownload.getSystemLanguage");
Returns Language Code
*/
NSUserDefaults* defs = [NSUserDefaults standardUserDefaults];
NSArray* languages = [defs objectForKey:#"AppleLanguages"];
NSString *language = [languages objectAtIndex:0];
NSLog(#"####### This is the language code%#",language);
NSString *jsCallBack;
jsCallBack = [NSString stringWithFormat:#"setLanguage('%#');",language];
[self.webView stringByEvaluatingJavaScriptFromString:jsCallBack];
}
edit 2: character encoding
When adding additional language characters to a sencha project (or any webview phonegap project), ensure that you have the proper encoding specified in the index file. This is the tag i needed.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
I've finished this language localization plugin. It's not amazing, but it worked better than I originally speculated. Here are the short answers to each of the questions.
1- Is there an easier way to implement language localization with my
setup?
Not that i'm aware of. The comment by Stuart provided this link Sencha-touch localization. Use a store or a global JSON object? which had some good information on one way that you can use class overrides. I didn't like this approach, because it spread my language translations into different classes. But certainly, if you are doing something simple, or you want something that could be more powerful, perhaps you should investigate that.
2- Will I run into issues with native sencha stuff (like datepickers)?
I ended up specifically leaving "datepickers" in english for now. But everything else was really relatively easy to customize. Almost every graphical UI element can have it's text altered.
3- When loading/reloading language json files, will I have performance
issues (require a webview reload?, sencha object resizing issues,
etc).
The method that i employed (see below) worked exceptionally well in regards to performance. The one issue that you have is right when you switch the languages, you need that specific page to reload. Sencha handled resizing without any flaws, except where I had been foolish and statically set sizes.
Some of what I did was described in edits to the question. Here is a detailed overview of my solution. (warning, it's not the most elegant solution.)
Instead of using a pure JSON file, I ended up just using a javascript function. This isn't the greatest solution because it requires some minimal maintenance, but JSON parsing with phonegap/sencha isn't the best. (I get JSON files from translater's, and quickly paste into the javascript file. Takes around 2 minutes, see further explanation below).
Language.js
function setLanguage(language)
{
if(language == "en")
{
//console.log("inside if Language == en");
GlobalLanguage.CurrentLanguage = language;
GlobalLanguage.ID = {"glossary": [
{
//CONVERTED JSON
about : 'About',
checking_for_updates : 'Checking for updates...(This may take a few minutes.)'
//Any additional translations
}
]};
}
if (language == "es"){
//console.log("inside language == es");
GlobalLanguage.CurrentLanguage = language;
GlobalLanguage.ID = {"glossary": [
{
//CONVERTED JSON
about : 'Acerca de ',
checking_for_updates : 'Verificando actualizaciones... (Capas que demore algunos minutos).'
//Any additional translations
}]};
}
if (language == "pt"){
//console.log("inside language == pt");
GlobalLanguage.CurrentLanguage = language;
GlobalLanguage.ID = {"glossary": [
{
//CONVERTED JSON
about : 'Sobre',
checking_for_updates : 'Verificando se há atualizações... (pode demorar alguns minutos.)'
//Any additional translations
}]};
}
}
As you can see, this file allows for 3 languages (Portugese, English, and Spanish). After setting the language, you could access each localized string anywhere in your object. For example, if you need to access the word "about" simply use:
GlobalLanguage.ID.glossary[0]["about"]
This will access the GlobalLanguage object which will have whatever language loaded into the properties. So throughout your project, you could have these calls. However, I'd recommend taking it one step further
function langSay(languageIdentifier){
// console.log("inside langSay");
if(!GlobalLanguage.ID.glossary[0][languageIdentifier]){
return "[! LANGUAGE EXCEPTION !]";
}
else{
return GlobalLanguage.ID.glossary[0][languageIdentifier];
}
}
This protects you from having language exceptions and having your program crash without knowing where (you may have hundreds or thousands of properties being set in that language.js file). So now simply :
langSay("about")
One additional note about formatting from JSON. The format you want your language files in is:
languageIdentifier : 'Translation',
languageIdentifier : 'Translation',
languageIdentifier : 'Translation'
I used Excel for the formatting. Also the languageIdentifiers are unique identifiers without spaces. I recommend just using Excel to format first 3 to 4 words word1_word2_word3_word4 of the english translation.
word1_word2_word3 : 'word1 word2 word3'
Hope this helps you out! I'd be happy to respond to any questions
I have an XML as input to a Java function that parses it and produces an output. Somewhere in the XML there is the word "stratégie". The output is "stratgie". How should I parse the XML as to get the "é" character as well?
The XML is not produced by myself, I get it as a response from a web service and I am positive that "stratégie" is included in it as "stratégie".
In the parser, I have:
public List<Item> GetItems(InputStream stream) {
try {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(stream);
doc.getDocumentElement().normalize();
NodeList nodeLst = doc.getElementsByTagName("item");
List<Item> items = new ArrayList<Item>();
Item currentItem = new Item();
Node node = nodeLst.item(0);
if (node.getNodeType() == Node.ELEMENT_NODE) {
Element item = (Element) node;
if(node.getChildNodes().getLength()==0){
return null;
}
NodeList title = item.getElementsByTagName("title");
Element titleElmnt = (Element) title.item(0);
if (null != titleElmnt)
currentItem.setTitle(titleElmnt.getChildNodes().item(0).getNodeValue());
....
Using the debugger, I can see that titleElmnt.getChildNodes().item(0).getNodeValue() is "stratgie" (without the é).
Thank you for your help.
I strongly suspect that either you're parsing it incorrectly or (rather more likely) it's just not being displayed properly. You haven't really told us anything about the code or how you're using the result, which makes it hard to give very concrete advice.
As ever with encoding issues, the first thing to do is work out exactly where data is getting lost. Lots of logging tends to be the way forward: create a small test case that demonstrates the problem (as small as you can get away with) and log everything about the data. Don't just try to log it as raw text: log the Unicode value of each character. That way your log will have all the information even if there are problems with the font or encoding you use to view the log.
The answer was here: http://www.yagudaev.com/programming/java/7-jsp-escaping-html
You can either use utf-8 and have the 'é' char in your document instead of é, or you need to have a parser that understand this entity which exists in HTML and XHTML and maybe other XML dialects but not in pure XML : in pure XML there's "only" ", <, > and maybe ' I don't remember.
Maybe you can need to specify those special-char entities in your DTD or XML Schema (I don't know which one you use) and tell your parser about it.