I have a local JSON file with some descriptions for an app, and I have found weird behaviour when parsing the \u0092 and \u0091 characters.
When the JSON file contains these characters, the corresponding parsed NSString is printed as "?", and in a UILabel the character disappears completely.
Example: "L\u0092H\u00e9r." is shown as "LHér." instead of "L'Hér."
If I replace these characters with \u2019, then I can see the character ' in the UILabel.
Does anybody have a clue about this?
EDIT: For the moment I will replace both of them with the character \u2019; it is also a ' and there is no risk of confusing it with a control character. Thank you all!
This answer is a little speculative, but I hope it gets you on the right track.
Your best bet may be to give up and replace \u0091 and \u0092 with something else as a preprocessing step before displaying the string. These are control characters and are unprintable in most encodings. But:
If the rest of the file is proper UTF-8, your JSON file probably has a problem: the encoding is wrong (CP-1250?) while you read the file as UTF-8, an error was made when converting the file, or something similar. So another solution is, of course, to fix your file.
If you're not sure how your file is encoded, it may simply be encoded in CP-1250, so reading the file using NSWindowsCP1250StringEncoding might fix your problem.
BTW, if you hardcode a string @"\u0091", you'll get a compile-time error: Universal character name refers to a control character. Yes, not even a warning, it's that unprintable in Unicode ;)
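As a rough illustration of the preprocessing idea (written in Python rather than Objective-C, purely as a language-neutral sketch), the snippet below maps stray C1 control characters back to the characters that the same byte values have in CP-1250. The function name repair_c1_controls and the sample JSON are made up for this example, and it assumes the control characters really did come from a mislabelled Windows code page.

```python
import json


def repair_c1_controls(text: str, codepage: str = "cp1250") -> str:
    """Replace C1 control characters (U+0080..U+009F) with the character
    that the same byte value has in the given Windows code page.

    Assumption: the controls are leftovers from bytes that were never
    converted from the code page to Unicode.
    """
    repaired = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                repaired.append(bytes([ord(ch)]).decode(codepage))
            except UnicodeDecodeError:
                # Byte value is not assigned in this code page: keep as-is.
                repaired.append(ch)
        else:
            repaired.append(ch)
    return "".join(repaired)


# Hypothetical sample resembling the question's data.
raw_json = '{"name": "L\\u0092H\\u00e9r."}'
parsed = json.loads(raw_json)["name"]   # contains the control char U+0092
fixed = repair_c1_controls(parsed)      # U+0092 becomes U+2019
print(fixed)
```

With this mapping, \u0091 and \u0092 become the curly quotes \u2018 and \u2019, which matches the manual substitution described in the question's edit.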
Related
Note: I have been redirected to this website, as it is believed to be the appropriate place for questions like this. If this is not the correct website, could someone please let me know where I can find help?
I'm trying to write my program in PyCharm, but for some annoying reason, whenever I try to type \, it shows up as ¥.
Here's a screenshot:
This is actually supposed to say print('\n'). Whatever happened has changed every \ to ¥ in all my files!
And yes, I have tried copying and pasting the \, but it just ends up changing into ¥.
So, could someone please let me know how to fix this?
This could be happening because you are using a Japanese font. Change the font to an English font like Arial.
If that doesn't work, you can enter the backslash by its code point: in both Unicode and ASCII it is encoded at U+005C.
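To check whether the files themselves were changed or only the way they are drawn, a small Python sketch like the one below may help. It assumes the usual explanation that some Japanese fonts draw U+005C with a yen glyph, while a real yen sign is a different character, U+00A5. The helper names are hypothetical.

```python
BACKSLASH = "\\"      # U+005C, the character Python uses for escapes
YEN_SIGN = "\u00a5"   # U+00A5, a genuinely different character


def code_point(ch: str) -> str:
    """Format a single character as a U+XXXX code point."""
    return f"U+{ord(ch):04X}"


def count_slash_like(text: str) -> dict:
    """Count real backslashes and real yen signs in a piece of source text."""
    return {"backslash": text.count(BACKSLASH), "yen": text.count(YEN_SIGN)}


# The line from the question, as it would be stored in the file.
source_line = "print('\\n')"
print(code_point(BACKSLASH), code_point(YEN_SIGN))
print(count_slash_like(source_line))
```

If the count reports backslashes and no yen signs, the file content is intact and only the font's glyph is misleading.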
I would like to convert Casa_Batll%C3%B3 to Casa_Batlló.
NSLog(@"Converting String:%@",[@"Casa_Batll%C3%B3" stringByReplacingOccurrencesOfString:@"%c3%b3" withString:@"ó"]);
Using this code, I only get the Latin or special characters that I already know about, not unknown ones. I am actually getting the strings from a database that was already created, so I don't know in advance which strings it contains. I have also tried using NSString+HTML.m from MWFeedParser, but I didn't get anything. I have also seen link1 and link2. Can anyone please help me?
Use stringByReplacingPercentEscapesUsingEncoding:.
NSLog(@"Converting String:%@",[@"Casa_Batll%C3%B3" stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding]);
Adjust the encoding as appropriate.
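For comparison, the same decoding can be sketched in Python with the standard library's urllib.parse.unquote. This is only an illustration of what percent-decoding does and why the encoding matters; it is not the Objective-C API used above.

```python
from urllib.parse import unquote

escaped = "Casa_Batll%C3%B3"

# %C3%B3 is the two-byte UTF-8 sequence for "ó", so decoding as UTF-8
# yields a single character.
as_utf8 = unquote(escaped, encoding="utf-8")

# Decoding the same bytes as Latin-1 treats each byte as its own
# character, which gives the familiar two-character garbage.
as_latin1 = unquote(escaped, encoding="latin-1")

print(as_utf8)    # Casa_Batlló
print(as_latin1)  # Casa_BatllÃ³
```

Because the decoder handles every escape generically, there is no need to know in advance which characters the database contains.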
I parsed an XML file containing UTF-8/Latin characters (é, â, è, î, etc.).
At first I tried to fix this with a function that replaces the wrong characters, but I'm having a problem with à, which is replaced by ".
And as I don't want to replace every " in my file, I have to find another way to fix it.
Any idea how to fix this?
Thanks a lot for your advice.
To finally answer this question: it worked using TBXML, but only with UTF-8 encoding, not ISO-8859-1.
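One plausible cause of such wrong characters (an assumption, since the question does not show how the parser was configured) is UTF-8 bytes being decoded as ISO-8859-1. The Python sketch below reproduces that mismatch and reverses it; the function names are made up for the example.

```python
def misdecode(text: str, actual: str = "utf-8", assumed: str = "iso-8859-1") -> str:
    """Simulate reading bytes written in `actual` as if they were `assumed`."""
    return text.encode(actual).decode(assumed)


def undo_misdecode(garbled: str, actual: str = "utf-8", assumed: str = "iso-8859-1") -> str:
    """Reverse the mistake by recovering the bytes and decoding them correctly."""
    return garbled.encode(assumed).decode(actual)


original = "é â è î"
garbled = misdecode(original)       # each accented letter becomes two characters
restored = undo_misdecode(garbled)  # back to the original text

print(garbled)
print(restored)
```

Fixing the decoding step this way avoids a character-by-character replacement table, which is what ran into trouble with à in the question.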
I have the following XML feeds that I would like to read:
Chinese XML - https://news.google.com/news/popular?ned=cn&topic=po&output=rss
Korean XML - http://www.voanews.com/templates/Articles.rss?sectionPath=/korean/news
Currently, I am trying to use LuaXML to parse XML that contains Chinese characters. However, when I print the result to the console, the Chinese characters are not printed correctly and show up as garbage characters.
Is there any way to parse Chinese or Korean characters into a Lua table?
I don't think Lua is the issue here. The raw data the remote site sends is encoded using UTF-8, and Lua does no special interpretation of that—which means it should be preserved perfectly if you just (1) read from the remote site, and (2) save the read data to a file. The data in the file will contain CJK characters encoded in UTF-8, just like the remote site sent back.
If you're getting funny results like you mention, the fault probably lies either with the library you're using to read from the remote site, or perhaps simply with the way your console displays the results when you output to it.
I managed to convert the "ä¸ç¾" into Chinese characters.
I needed one additional step: converting all of the strings using the method from this link, http://forum.luahub.com/index.php?topic=3617.msg8595#msg8595, before saving them in XML format.
string.gsub(l,"&#([0-9]+);", function(c) return string.char(tonumber(c)) end)
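A rough Python equivalent of that substitution is sketched below. It assumes, as string.char does in the Lua line, that every numeric entity holds a single byte value (0 to 255) and that the resulting byte sequence is UTF-8. The function name and the sample input are invented for the illustration.

```python
import re


def decode_byte_entities(raw: bytes) -> str:
    """Turn each &#N; entity into the single byte N, then decode as UTF-8.

    Assumption: N is always in 0..255, i.e. the feed stored UTF-8 bytes
    one entity at a time. bytes([n]) raises ValueError otherwise.
    """
    as_bytes = re.sub(
        rb"&#([0-9]+);",
        lambda match: bytes([int(match.group(1))]),
        raw,
    )
    return as_bytes.decode("utf-8")


# Hypothetical input: the six UTF-8 bytes of two Chinese characters,
# each written as a decimal entity.
sample = b"<title>&#228;&#184;&#173;&#230;&#150;&#135;</title>"
print(decode_byte_entities(sample))
```

If a feed instead uses entities that hold full code points (values above 255), this byte-wise approach does not apply and a regular entity decoder is the better fit.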
I would also like to ask about LuaXML: I have come across the method xml.registerCode(decoded,encoded).
Its documentation says:
registers a custom code for the conversion between non-standard characters and XML character entities
What do they mean by non-standard characters, and how do I use it?
I'm trying to read and save Chinese characters from websites.
For example:
The HTML source code has this line:
title="网络歌手"
When I read this as an NSString, the value returned is in a format like:
\UT0212\UT0999
or something like that.
I have tried converting using gb2312, utf-8, and other encodings, but I don't quite get the exact Chinese. Sometimes I get close to Chinese, but not the exact words.
Any help is appreciated!
Regards,
Suraj
http://www.pinyin.info/tools/converter/chars2uninumbers.html
I believe you would have to convert the characters to Unicode, similar to what they did in the above article.
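The relationship between the page bytes, the decoded text and the escaped form can be sketched in Python as follows. It assumes the page is served as GB2312 (the question does not say which charset the site declares), and the helper names are hypothetical.

```python
def to_escapes(text: str) -> str:
    """Show text as backslash-u escapes, similar to a debug printout."""
    return text.encode("unicode_escape").decode("ascii")


def from_escapes(escaped: str) -> str:
    """Turn backslash-u escapes back into the characters they stand for."""
    return escaped.encode("ascii").decode("unicode_escape")


title = "网络歌手"

# Stand-in for the raw bytes downloaded from a GB2312-encoded page.
page_bytes = title.encode("gb2312")

decoded = page_bytes.decode("gb2312")                  # charset matches
wrong = page_bytes.decode("utf-8", errors="replace")   # charset mismatch

escaped = to_escapes(decoded)
print(escaped)
print(from_escapes(escaped))
```

The escaped form is only another way of writing the same characters, so seeing escapes in a log does not by itself mean the text is damaged. Getting "close to Chinese, but not the exact words" is more typical of decoding the bytes with the wrong charset.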