NSDictionary writeToFile encoding only in UTF-8? - ios

Writing an NSDictionary to a file gives me a UTF-8 encoded XML file. I need to write the data to a file in NSISOLatin1StringEncoding. Is NSDictionary UTF-8 only? How can I achieve my goal?

Are you sure you need a file encoded as ISO Latin-1? The problem with all encodings other than some form of Unicode is that they can't represent all possible characters.
The encoding is surely the least of your problems. A dictionary's file representation is a property list file. It's unlikely that any code which requires Latin-1 encoding would understand that format. Indeed, the format is not guaranteed. It's not even guaranteed to be XML or textual. Property lists may be binary.
If you want to exchange data with a program that's going to use anything other than Cocoa's property list implementation, you should manually write the contents of the dictionary out in a format that's defined independently of Apple's property list format.
And, yes, if Cocoa does write the property list as XML, it's going to be UTF-8-encoded.
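The idea of writing the dictionary out manually in an independently defined format can be sketched as follows. This is a minimal Python illustration rather than Cocoa code; the `key=value` line format and the names `encode_latin1` and `write_latin1` are assumptions made up for the example. It also shows the limitation mentioned above: characters that Latin-1 cannot represent make the encoding step fail.

```python
def encode_latin1(settings: dict) -> bytes:
    """Serialize a flat dict as key=value lines, encoded as ISO Latin-1.

    The line format is a hypothetical stand-in for whatever format the
    consuming program actually expects.
    """
    text = "".join(f"{key}={value}\n" for key, value in settings.items())
    # Strict encoding raises UnicodeEncodeError for characters outside
    # Latin-1 instead of silently dropping or replacing them.
    return text.encode("latin-1")


def write_latin1(settings: dict, path: str) -> None:
    """Write the Latin-1 encoded representation to a file."""
    with open(path, "wb") as handle:
        handle.write(encode_latin1(settings))


if __name__ == "__main__":
    print(encode_latin1({"name": "Zoë", "city": "Köln"}))
    try:
        encode_latin1({"name": "中美"})
    except UnicodeEncodeError as exc:
        print("cannot be represented in Latin-1:", exc)
```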

Related

Character encoding in PHP extension

I'm currently writing a PHP extension in C++ with the Zend API. Basically, I write PHP_METHOD{..} wrappers around my native C++ interface methods and use "zend_parse_parameters(..)" to fetch the corresponding input arguments.
This extension contains methods which can take strings as arguments, such as a filename.
I know from http://php.net/manual/en/language.types.string.php#language.types.string.details that strings have no encoding in PHP. Still, can I expect the PHP programmer to use a function like "utf8_decode(..)" so that the extension can read the input strings correctly?
Or does the PHP programmer expect the extension to detect the encoding from the PHP script and handle strings accordingly?
Any help is highly appreciated! Thanks!
You are correct: strings are just binary blobs in PHP. As the author of an extension, your options are:
1. Have the user hand your extension UTF-8: By far the best option. The user has to make the decision. Assert that the string is valid UTF-8 and fail early.
2. Encode it yourself: You cannot know the meaning of the string. PHP strings are just binary blobs with no encoding information, so you do not know what the intended content is. It might come from a Windows file in an unusual encoding and have been concatenated with text in a completely different encoding. Worse, it might happen to be valid UTF-8 without actually being UTF-8, in which case you would misinterpret it without the user knowing. Hence option 1: have the user pass UTF-8.
3. Alternative: Force the user to pass an input encoding.
Here is an example of alternative 3:
$obj = new MyExtensionClass('UTF-8'); // force encoding
$obj->someMethod($inputStr); // try to convert now
The standard library uses approach 1. See json_encode as an example: it requires all string data to be UTF-8 encoded.
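The options above can be sketched in Python, where `bytes` plays the role of PHP's encoding-less strings. The function name `decode_input` is hypothetical. By default it expects UTF-8 and fails early, and the optional `encoding` argument corresponds to forcing the caller to state the input encoding.

```python
def decode_input(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode caller-supplied bytes, failing early on invalid input."""
    try:
        return raw.decode(encoding, errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(f"input is not valid {encoding}: {exc}") from exc


# The same two bytes are valid in both encodings but mean different things,
# which is why the encoding cannot be guessed from the bytes alone.
ambiguous = b"\xc3\xa9"
as_utf8 = decode_input(ambiguous)               # one character: "é"
as_latin1 = decode_input(ambiguous, "latin-1")  # two characters: "Ã©"

if __name__ == "__main__":
    print(as_utf8, as_latin1)
```

Only the caller knows which of the two readings was intended, so the extension should ask rather than guess.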

How to change encoding in RestKit for iOS from UTF-8 to Windows-1251

I get a response from the server in Windows-1251, and RestKit returns nil for my mapped object strings. I know RestKit has a defaultHTTPEncoding property in RKClient (https://github.com/RestKit/RestKit/commit/0ead8a922219ec42ec6dae6ebe59139a1fd589ae). How can I use it, and can it help me?
I'm assuming that your server is returning JSON. If this is the case then the server needs to be updated because it isn't conformant to the JSON spec. Specifically:
Encoding
JSON text SHALL be encoded in Unicode. The default encoding is UTF-8.
An important point to note is that RestKit doesn't unpack the response into a string, because the JSON deserialisation (NSJSONSerialization) takes a data object. And the NSJSONSerialization documentation states:
The data must be in one of the 5 supported encodings listed in the JSON specification: UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, UTF-32BE. The data may or may not have a BOM. The most efficient encoding to use for parsing is UTF-8, so if you have a choice in encoding the data passed to this method, use UTF-8.
So to handle your server response, if you can't change it, you'll need to handle the download yourself, convert the data to the appropriate encoding, unpack the JSON, and then create a mapping operation to use that.
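The convert-then-parse steps can be sketched outside RestKit as follows. This is a Python illustration under the assumption that the response body really is Windows-1251; the function name `parse_cp1251_json` and the sample payload are made up for the example, and the download and mapping steps are left out.

```python
import json


def parse_cp1251_json(body: bytes):
    """Transcode a Windows-1251 response body to UTF-8, then parse it as JSON."""
    text = body.decode("cp1251")      # interpret the raw bytes as Windows-1251
    utf8_body = text.encode("utf-8")  # re-encode in an encoding JSON parsers accept
    return json.loads(utf8_body)


if __name__ == "__main__":
    # Simulated server response; a real one would come from the download step.
    response_body = '{"city": "Москва"}'.encode("cp1251")
    print(parse_cp1251_json(response_body))
```

In the iOS case, the re-encoded UTF-8 data would be what gets handed to the JSON deserialisation before the mapping operation.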

Base64 vs NSPropertyListSerialization

I need to encode my image into text.
And I found this class for that:
Base64 for iOS with ARC
When I encode my image, I see that NSPropertyListSerialization produces exactly the same string as Base64 does. Is NSPropertyListSerialization a valid way to create a Base64 string, or am I missing something?
Base64:
[data base64EncodedString];
NSPropertyListSerialization:
[[NSString alloc] initWithData:[NSPropertyListSerialization dataWithPropertyList:data format:NSPropertyListXMLFormat_v1_0 options:0 error:nil] encoding:NSUTF8StringEncoding] // NSData bytes are not NUL-terminated, so avoid stringWithUTF8String:
No, you're not missing anything. Base64 is simply a standard for encoding binary data in ASCII, and plists use Base64 to encode binary data such as images (via NSPropertyListSerialization), so both should produce identical Base64 strings for the same binary data.
If you're wondering which to use in your application, I'd recommend the Base64 library. While Apple has pushed to make plists a standard, and plists will probably always encode binary data as Base64, in the extremely unlikely event that Apple changes something or drops support for plists, your code will break. Besides, it's best to be clear in your code (for yourself and others) that you're encoding your data as Base64.
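The equivalence can be checked with a short sketch. It uses Python's `plistlib` rather than Apple's NSPropertyListSerialization, so it only illustrates the XML plist format, not Cocoa's implementation; the `image` key and the sample bytes are made up. The XML plist wraps the payload in a `<data>` element and breaks it across lines, so whitespace is stripped before comparing.

```python
import base64
import plistlib
import re

# Stand-in for image bytes.
data = bytes(range(256)) * 2

# Serialize as an XML property list; binary values end up in a <data> element.
xml_plist = plistlib.dumps({"image": data}, fmt=plistlib.FMT_XML).decode("utf-8")

# Extract the <data> payload and drop the line breaks and indentation.
payload = re.search(r"<data>(.*?)</data>", xml_plist, re.DOTALL).group(1)
plist_base64 = "".join(payload.split())

# Encode the same bytes directly.
direct_base64 = base64.b64encode(data).decode("ascii")

if __name__ == "__main__":
    print(plist_base64 == direct_base64)
```

The extra extraction work is one more reason to call a Base64 encoder directly when Base64 is all that is needed.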

Lua: reading Chinese characters

I have the following XML feeds that I would like to read:
Chinese XML - https://news.google.com/news/popular?ned=cn&topic=po&output=rss
Korean XML - http://www.voanews.com/templates/Articles.rss?sectionPath=/korean/news
Currently, I use LuaXML to parse the XML, which contains Chinese characters. However, when I print the result to the console, the Chinese characters are not printed correctly and show up as garbage.
Is there any way to parse Chinese or Korean characters into a Lua table?
I don't think Lua is the issue here. The raw data the remote site sends is encoded using UTF-8, and Lua does no special interpretation of that—which means it should be preserved perfectly if you just (1) read from the remote site, and (2) save the read data to a file. The data in the file will contain CJK characters encoded in UTF-8, just like the remote site sent back.
If you're getting funny results like you mention, the fault probably lies either with the library you're using to read from the remote site, or perhaps simply with the way your console displays the results when you output to it.
I managed to convert the numeric character references for "中美" into Chinese characters.
It takes one additional step: converting every such reference in the string with the method from this link, http://forum.luahub.com/index.php?topic=3617.msg8595#msg8595, before saving in XML format.
string.gsub(l, "&#([0-9]+);", function(c) return utf8.char(tonumber(c)) end) -- utf8.char needs Lua 5.3+; string.char only accepts values 0-255
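The same substitution can be sketched in Python to show what the conversion has to produce. The function name `decode_numeric_refs` is hypothetical, and the sample input assumes the feed contains decimal references of the form `&#NNNN;`. Each decimal reference is a Unicode code point, which becomes a multi-byte sequence once the text is encoded as UTF-8.

```python
import re


def decode_numeric_refs(text: str) -> str:
    """Replace decimal character references such as &#20013; with the character."""
    return re.sub(r"&#([0-9]+);", lambda match: chr(int(match.group(1))), text)


decoded = decode_numeric_refs("&#20013;&#32654;")
utf8_bytes = decoded.encode("utf-8")  # three bytes per character for these two

if __name__ == "__main__":
    print(decoded, utf8_bytes)
```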
I also have a question about LuaXML: I came across the method xml.registerCode(decoded,encoded).
Its documentation says that it
registers a custom code for the conversion between non-standard characters and XML character entities
What do they mean by non-standard characters and how do I use it?

Rails 3 dealing with special characters

I want to let users fill in an input field with special characters (e.g. ¥ and others).
The user input may be saved to an XML file and later fetched and rendered back into the form input.
What is the best practice for saving special symbols to XML (maybe using HTML entities or hexadecimal form)?
Thanks in advance.
I'd say if you save the file in UTF-8 you will have no problems.
If some controller/view has problems with encoding, you have to place this on the first line of that file:
# encoding: utf-8
There's nothing special about them and you don't need to encode them. Let your XML library deal with that. XML has supported Unicode from the start, and what you call "special symbols" are just Unicode characters.
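A minimal sketch of letting the XML library handle it, written in Python with `xml.etree.ElementTree` rather than a Ruby library; the element names `form` and `price` and the sample value are made up. Serialized as UTF-8 the character is stored as-is, serialized as ASCII the library emits a numeric character reference on its own, and both forms parse back to the same text.

```python
import xml.etree.ElementTree as ET

# Build a small document holding user input with a "special" character.
root = ET.Element("form")
field = ET.SubElement(root, "price")
field.text = "¥ 1000 & up"

# UTF-8 output keeps the character; ASCII output escapes it automatically.
utf8_xml = ET.tostring(root, encoding="utf-8")
ascii_xml = ET.tostring(root, encoding="us-ascii")

# Reading either form back yields the original text for the form input.
restored_from_utf8 = ET.fromstring(utf8_xml).find("price").text
restored_from_ascii = ET.fromstring(ascii_xml).find("price").text

if __name__ == "__main__":
    print(utf8_xml)
    print(ascii_xml)
    print(restored_from_utf8 == restored_from_ascii)
```

The only characters that need escaping in element text are markup characters such as `&` and `<`, and the library takes care of those as well.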
