Get Encoding type of NSData - ios

I have some chunks of data, each encoded with a different, unknown encoding: say the first chunk uses NSUTF8StringEncoding, and another uses NSASCIIStringEncoding or kCFStringEncodingWindowsArabic.
I don't know which chunk uses which encoding. I have tried several approaches, e.g. if the result is nil, decoding again with NSNonLossyASCIIStringEncoding, but to no avail. Is there any way to determine which encoding a specific chunk of data uses?
Any help will be appreciated.

You can find the answer for your question here: https://stackoverflow.com/a/9836989/2923506
This is code copied from the user MiiChiel and adapted to ARC, because it's a good answer. "if ASCII and UTF8 give both a string in return. For instance: UTF8 gives me some extra characters (negative result) and ASCII are showing the right characters (positive result)."
NSString *responseString, *responseStringASCII, *responseStringUTF8;
responseStringASCII = [[NSString alloc] initWithData:responseData encoding:NSASCIIStringEncoding];
if (!responseStringASCII)
{
    // ASCII is not working, will try UTF-8
    responseString = [[NSString alloc] initWithData:responseData encoding:NSUTF8StringEncoding];
}
else
{
    // ASCII is working, but check whether UTF-8 gives fewer characters
    responseStringUTF8 = [[NSString alloc] initWithData:responseData encoding:NSUTF8StringEncoding];
    if (responseStringUTF8 != nil && [responseStringUTF8 length] < [responseStringASCII length])
    {
        // Under ARC a plain assignment is enough; no retain call
        responseString = responseStringUTF8;
    }
    else
    {
        responseString = responseStringASCII;
    }
}
I hope this can help you.
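The try-one-encoding-then-the-next idea above is not specific to Foundation. As a rough, language-neutral sketch in Python (the function name decode_with_fallback and the candidate list are assumptions made for illustration, with cp1256 standing in for Windows Arabic), strict decoding can be attempted in order until one candidate succeeds:

```python
def decode_with_fallback(data, candidates=("ascii", "utf-8", "cp1256")):
    """Return (text, encoding_name) for the first candidate that decodes
    `data` without error, or (None, None) if every candidate fails."""
    for name in candidates:
        try:
            # bytes.decode is strict by default: it raises instead of guessing
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    return None, None
```

The order of candidates matters: a single-byte encoding accepts almost any byte sequence, so it belongs at the end of the list. A successful decode only shows that the bytes are valid in that encoding, not that it is the encoding the sender actually used.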

Foundation (Objective-C) includes a built-in way to detect the encoding of a string embedded in NSData (iOS 8 and later).
(Note: in your case you still need to split each chunk into a separate NSData object.)
NSData *data = ....; // assign your NSData object
NSString *string;
NSStringEncoding encoding = [NSString stringEncodingForData:data encodingOptions:nil convertedString:&string usedLossyConversion:NULL];

Related

Encoding for converting between NSString to NSData and back

I'm trying to encrypt/decrypt an NSString and return the original string in the end. Here's how I convert the string to a data object:
NSData *string_data = [string dataUsingEncoding:NSUTF8StringEncoding];
And after that data has been encrypted/decrypted I want it back to the original string by doing:
NSString *to_string = [NSString stringWithCString:[decrypted_data bytes] encoding:NSUTF8StringEncoding];
The encoding seems to match, but I still get a null when I try to print out to_string to the console. I've tried all sorts of encoding settings. It doesn't seem to work.
Use:
NSString *to_string = [[NSString alloc] initWithData:decrypted_data encoding:NSUTF8StringEncoding];
It is not safe to use stringWithCString:encoding: because the byte buffer you get from NSData is not guaranteed to be null-terminated.

JSONObjectWithData and umlauts

I am using a service (not mine) that supplies JSON-formatted data. When I try to parse the data with JSONObjectWithData:options:error:, it returns nil if there is an umlaut (ö, for example). It works fine if there are no umlauts or other special characters.
The person running the service says the data is encoded as ISO-8859-1 (not UTF-8).
Is there anything I can do at my end to get such data to parse correctly?
Try the code below:
NSError *error = nil;
NSString *string = [NSString stringWithContentsOfURL:webURL encoding:NSISOLatin1StringEncoding error:&error];
NSData *utf8Data = [string dataUsingEncoding:NSUTF8StringEncoding];
// JSONObjectWithData: must not be given nil data, so guard against a failed download
id jsonObject = utf8Data ? [NSJSONSerialization JSONObjectWithData:utf8Data options:kNilOptions error:&error] : nil;
if (!jsonObject) {
    // Error handling: check the return value first, then inspect error
} else {
    // Use your JSON object
}
If you already have NSData encoded as Latin-1 (ISO-8859-1), decode it with that encoding first; the resulting NSString can then be re-encoded as UTF-8:
const unsigned char latin1[] = {196}; // ISO-8859-1 code for Ä (A with umlaut)
NSData *latin1Data = [NSData dataWithBytes:latin1 length:sizeof(latin1)];
// initWithData: uses the data's length, so the buffer need not be null-terminated
NSString *utfstr = [[NSString alloc] initWithData:latin1Data encoding:NSISOLatin1StringEncoding];
NSLog(@"%@", utfstr);
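The same decode-as-Latin-1, re-encode-as-UTF-8 step can be shown at the byte level. This is a minimal Python sketch, not the service's actual payload: the function names and the sample JSON document are assumptions made for illustration.

```python
import json


def latin1_to_utf8(raw):
    """Re-encode ISO-8859-1 bytes as UTF-8 bytes."""
    return raw.decode("iso-8859-1").encode("utf-8")


def parse_latin1_json(raw):
    """Parse a JSON document whose bytes are ISO-8859-1 encoded."""
    # json.loads accepts UTF-8 bytes, so convert before parsing
    return json.loads(latin1_to_utf8(raw))
```

For example, the single Latin-1 byte 0xC4 (Ä) becomes the two UTF-8 bytes 0xC3 0x84, which is why a strict UTF-8 parser rejects the original data.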

responseString is null

My web service returns a JSON string containing newlines, which I think is why responseString is always null.
NSString *responseString=[[NSString alloc] initWithData:kampanyadata encoding:NSUTF8StringEncoding];
NSLog(@"%@",[NSString stringWithFormat:@"responsestring:%@",responseString]);
----responsestring:null
How can I replace the newline characters in the JSON string?
NSString *cleanedString = [responseString stringByReplacingOccurrencesOfString:@"\n" withString:@" "];
I think you need to do this :)
Are you sure kampanyadata is not null? If it's not null, try NSASCIIStringEncoding like this:
NSString *responseString = [[NSString alloc] initWithData:kampanyadata encoding:NSASCIIStringEncoding];
Offtopic
By the way, NSLog() already takes a format string as its first parameter, so you can write:
NSLog(@"responsestring:%@", responseString);
soapResults = [[NSString alloc]
    initWithBytes:[webData mutableBytes]
    length:[webData length]
    encoding:NSUTF8StringEncoding];
Try this; it works fine for me.
You got some NSData, and you tried to convert it to an NSString. There's no JSON involved at this point. Any errors have nothing to do with JSON or newline characters whatsoever. Possibilities: 1. The NSData that you received is nil. 2. The NSData that you received isn't in UTF-8 format.
Your NSLog statement is quite funny. Look at the definition of NSLog: the first parameter is a format string. Instead of
NSLog(@"%@",[NSString stringWithFormat:@"responsestring:%@",responseString]);
you should write
NSLog(@"responseString:%@", responseString);
And you can pass the JSON document directly to NSJSONSerialization. There is no need to convert it to an NSString; if the data is large, the conversion is just a waste of valuable memory.

NSData to NSString encoding

I have an application that receives messages from server.
Those messages may contain Cyrillic characters. But when I convert the received data into an NSString, I get only "\u041c\u0430\u043a" symbols instead of Cyrillic ones.
NSData *responseData = ....;
NSString *responseString = [[NSString alloc] initWithData:responseData encoding:NSUTF8StringEncoding];
How may I get correct symbols?
There's a much easier solution.
If your data has literal Unicode escape sequences in it (that is, \u041c\u0430\u043a as pure ASCII characters, with the escapes not yet decoded), then it is not the UTF-8 encoding of that string. You want NSNonLossyASCIIStringEncoding.
NSData *responseData = ....;
NSString* responseString = [[NSString alloc] initWithData:responseData encoding:NSNonLossyASCIIStringEncoding];
responseString will now be exactly what you expect.
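To see why the UTF-8 decode "succeeds" but leaves the escapes untouched, it helps to look at the bytes. The sketch below uses Python's unicode_escape codec as a rough analogue of NSNonLossyASCIIStringEncoding. That pairing is an assumption for illustration: the two are not identical, since the Python codec also interprets escapes such as \n and treats bytes above 127 as Latin-1. The function name is hypothetical.

```python
def decode_unicode_escapes(data):
    """Turn literal ASCII escape sequences such as \\u041c into characters."""
    return data.decode("unicode_escape")


# Six ASCII bytes per character: a backslash, a 'u', and four hex digits
raw = b"\\u041c\\u0430\\u043a"

as_utf8 = raw.decode("utf-8")            # valid UTF-8, but still 18 ASCII characters
as_text = decode_unicode_escapes(raw)    # three Cyrillic characters
```

Because every byte in raw is plain ASCII, the UTF-8 decode cannot fail; it simply returns the escape sequences as text, which matches what the question describes.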

iOS : decode utf8 string

I'm receiving JSON data from a server with some strings inside. I use SBJson https://github.com/stig/json-framework to parse it.
However, when I display some of the strings in a UILabel they look like this: \u0418\u043b\u044c\u044f\u0411\u043b\u043e\u0445 (those are Cyrillic symbols)
Latin characters display correctly.
How can I decode this into normal symbols?
Some code about getting data:
NSData * data = [NSURLConnection sendSynchronousRequest:request returningResponse:&response error:&error];
NSString *stringData = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
NSDictionary *object = [parser objectWithString:stringData error:nil];
NSString *comments = [NSString stringWithFormat:@"%@", [object valueForKey:@"comments"]];
The comments string has a very special format, so I do some operations such as stringByTrimmingCharactersInSet:, stringByReplacingOccurrencesOfString:withString:, and
NSArray *json_fields = [comments_modified componentsSeparatedByString:@";"];
to get the final data.
This is an example of received data after some trimming/replacing (it's NSString* comments):
"already_wow"=0;"date_created"="2012/03/1411:11:18";id=41598;name="\U0418\U043b\U044c\U044f\U0411\U043b\U043e\U0445";text="\U0438\U043d\U0442\U0435\U0440\U0435\U0441\U043d\U043e";"user_id"=1107;"user_image"="user_image/a6/6f/96/21/20111220234109510840_1107.jpg";"user_is_deleted"=0;username=IlyaBlokh;"wow_count"=0;
You can see that the text and name fields are encoded.
If I display them in a view (a UILabel, for example), they still look the same.
Maybe the returned string is just the ASCII representation of the Unicode string (escape sequences) rather than UTF-8 encoded content. Try NSASCIIStringEncoding when creating stringData.
