Bytes are changed after encoding NSString into NSInputStream via NSData - ios

I ran into the following problem when encoding an NSString as NSString -> NSData -> NSInputStream and then decoding from the NSInputStream with the read method:
NSString *inputString = [NSString stringWithFormat:@"%c", 255];
NSData *data = [inputString dataUsingEncoding:NSUTF8StringEncoding];
NSInputStream *stream = [NSInputStream inputStreamWithData:data];
[stream open];
uint8_t bytes;
[stream read:&bytes maxLength:1];
NSLog(@"%i", bytes);
The output is 195 instead of 255. Why?

Because of the kind of encoding you used for the string. UTF-8 is a form of string encoding which will end up converting characters with values above 127 into multi-byte sequences. So although inputString contained a single character, your data object didn't actually contain a single byte as you may have assumed, but multiple (two, in this case) bytes. And when you read from the stream, you only read the first byte of the encoded data, but there was more there.
You didn't need to run the data through the input stream to see this result. Accessing the first byte of the NSData instance would have shown the same thing.
You say that this is a "problem" but you don't suggest what you're trying to actually accomplish. 255 isn't a printable/meaningful text character. If you want to transmit raw data bytes, you can do that directly, rather than using an NSString and string encodings. If you are transmitting strings, then it's already doing the right thing. You just need to be prepared that your data size can exceed your string "length".
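To make the byte values concrete, here is a minimal sketch of the UTF-8 encoding rule in plain C (which also compiles as Objective-C). The function name utf8_encode is made up for this illustration and is not part of Foundation. It assumes the string in the question holds the single character U+00FF, which the observed output suggests.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 (hypothetical helper).
   Writes up to 4 bytes into out and returns the byte count,
   or 0 if cp is a surrogate or lies beyond U+10FFFF. */
size_t utf8_encode(uint32_t cp, uint8_t out[4])
{
    if (cp < 0x80) {                       /* 1 byte: 0xxxxxxx */
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {    /* surrogates are not characters */
        return 0;
    }
    if (cp < 0x10000) {                    /* 3 bytes */
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes */
        out[0] = (uint8_t)(0xF0 | (cp >> 18));
        out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (uint8_t)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Calling utf8_encode(0xFF, buf) returns 2 and fills buf with 0xC3 0xBF, that is 195 followed by 191. The 195 in the question is the first of those two bytes.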

Related

Null included in string imported from rs232

I have a string that is appended to a text view from an RS-232 device. Everything imports, but some of the strings get concatenated as if a null value were attached to the start and end of some of the imported data. Any idea how to look for a null value in a string?
- (void)readBytesAvailable:(NSInteger)count {
DataSource *sharedManager = [DataSource sharedManager];
const int bufferLength = 1024;
uint8_t buffer[bufferLength];
NSString *s;
NSInteger bytesRead = [session read:buffer bufferLength:bufferLength];
if (bytesRead > 0) {
// Convert to a string - note that the remote device is sending only well-formed UTF-8 text data (e.g. no binary data, no VT100 escape codes, etc.).
s = [[NSString alloc] initWithBytes:buffer length:bytesRead encoding:NSUTF8StringEncoding];
AllDataTextView.text = [AllDataTextView.text stringByAppendingString:s];
}
}
The way you are parsing the strings is wrong.
If you are reading bytes in chunks of 1024 bytes, a chunk can easily end in the middle of a character (a UTF-8 character can be 1, 2, 3, or 4 bytes long). When you then try to parse such a chunk, you will get nil (null), because the buffer doesn't contain valid data (a partial character is invalid and makes parsing fail).
The ideal solution would be to read all the bytes first and then convert them to text. Another solution is to detect partial characters (see UTF8 specification) and parse only the characters that are whole and keep the partial characters in the buffer until next iteration.
See Decoding partial UTF-8 into NSString
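The second approach, detecting a partial character at the end of a chunk, can be sketched in plain C as below. The function name utf8_complete_prefix is hypothetical. It only inspects the tail of the buffer and does not validate the rest, so treat it as a starting point rather than a full decoder.

```c
#include <stddef.h>
#include <stdint.h>

/* Return how many leading bytes of buf form whole UTF-8 sequences
   (hypothetical helper). Bytes past that count are the start of a
   character whose remaining bytes have not arrived yet.
   Only the tail is inspected; malformed data is returned as-is so
   the real decoder can reject it. */
size_t utf8_complete_prefix(const uint8_t *buf, size_t len)
{
    size_t i = len;
    size_t back = 0;

    /* Step back over at most 3 continuation bytes (10xxxxxx). */
    while (i > 0 && back < 3 && (buf[i - 1] & 0xC0) == 0x80) {
        i--;
        back++;
    }
    if (i == 0) {
        return len;               /* no lead byte in range */
    }

    uint8_t lead = buf[i - 1];
    size_t need;
    if ((lead & 0x80) == 0x00)      need = 1;
    else if ((lead & 0xE0) == 0xC0) need = 2;
    else if ((lead & 0xF0) == 0xE0) need = 3;
    else if ((lead & 0xF8) == 0xF0) need = 4;
    else return len;              /* not a valid lead byte */

    size_t have = len - (i - 1);  /* lead byte plus what follows it */
    return (have < need) ? (i - 1) : len;
}
```

In the read callback, you would convert only the first utf8_complete_prefix(buffer, bytesRead) bytes to an NSString and carry the remaining bytes over to the front of the next chunk.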

NSString initWithBytes with openssl encrypted data returns nil with NSUTF8StringEncoding

I'm having an issue where I'm trying to create an NSString from encrypted data created by OpenSSL and I keep getting nil for the string.
The code that I'm using to encrypt and decrypt the data is taken from the following link http://saju.net.in/code/misc/openssl_aes.c.txt
Now here is the code where I'm calling to encrypt my data ("aes_init" is of course called on my application init):
//-- encrypt saved data
int textLen = str.size();
char* buff = const_cast<char*>(str.c_str());
EVP_CIPHER_CTX encryptCtx;
unsigned char *ciphertext = aes_encrypt(&encryptCtx,
reinterpret_cast<unsigned char*>(buff),
&textLen);
NSString * nsstring = [[NSString alloc] initWithBytes:reinterpret_cast<const char*>(ciphertext)
length:strlen(reinterpret_cast<const char*>(ciphertext))
encoding:NSUTF8StringEncoding];
[nsstring autorelease];
UIApplication* clientApp = [UIApplication sharedApplication];
AppController* appController = ((AppController *)clientApp.delegate);
[appController saveData:nsstring]; //--> crash at this line
I've tried different encodings (NSASCIIStringEncoding and NSUnicodeStringEncoding) and they don't crash, but the data is completely wrong after I decode it.
Any ideas on how to solve this issue?
Thanks :)
Ciphertext, the output of an encryption function, should be indistinguishable from random data. That means any byte value can occur, including byte sequences that are not valid UTF-8, which is why initWithBytes returns nil. You therefore need to encode the ciphertext as text before putting it in a string, for instance using Base64.
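As an illustration of the Base64 step, here is a small self-contained encoder in plain C. The name base64_encode is made up for this sketch. On iOS 7 and later you would more likely call NSData's base64EncodedStringWithOptions: instead. Note also that the length passed in should be the ciphertext length reported by the encryption call, not strlen, because ciphertext may contain zero bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Base64-encode len bytes from src (hypothetical helper).
   Returns a NUL-terminated string allocated with malloc
   (the caller frees it), or NULL if allocation fails. */
char *base64_encode(const uint8_t *src, size_t len)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t out_len = 4 * ((len + 2) / 3);
    char *out = malloc(out_len + 1);
    if (out == NULL) {
        return NULL;
    }

    size_t i = 0, o = 0;
    /* Full 3-byte groups become 4 output characters. */
    while (i + 2 < len) {
        uint32_t v = ((uint32_t)src[i] << 16) |
                     ((uint32_t)src[i + 1] << 8) |
                     (uint32_t)src[i + 2];
        out[o++] = tbl[(v >> 18) & 0x3F];
        out[o++] = tbl[(v >> 12) & 0x3F];
        out[o++] = tbl[(v >> 6) & 0x3F];
        out[o++] = tbl[v & 0x3F];
        i += 3;
    }
    /* One or two leftover bytes are padded with '='. */
    if (i < len) {
        int two = (i + 1 < len);
        uint32_t v = (uint32_t)src[i] << 16;
        if (two) {
            v |= (uint32_t)src[i + 1] << 8;
        }
        out[o++] = tbl[(v >> 18) & 0x3F];
        out[o++] = tbl[(v >> 12) & 0x3F];
        out[o++] = two ? tbl[(v >> 6) & 0x3F] : '=';
        out[o++] = '=';
    }
    out[o] = '\0';
    return out;
}
```

The result contains only ASCII characters, so it can be stored in an NSString with any common encoding and decoded back to the original bytes before decryption.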

convert unicode string to nsstring

I have a unicode string as
{\rtf1\ansi\ansicpg1252\cocoartf1265
{\fonttbl\f0\fswiss\fcharset0 Helvetica;\f1\fnil\fcharset0 LucidaGrande;}
{\colortbl;\red255\green255\blue255;}
{\*\listtable{\list\listtemplateid1\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\*\levelmarker \{check\}}{\leveltext\leveltemplateid1\'01\uc0\u10003 ;}{\levelnumbers;}\fi-360\li720\lin720 }{\listname ;}\listid1}}
{\*\listoverridetable{\listoverride\listid1\listoverridecount0\ls1}}
\paperw11900\paperh16840\margl1440\margr1440\vieww22880\viewh16200\viewkind0
\pard\li720\fi-720\pardirnatural
\ls1\ilvl0
\f0\fs24 \cf0 {\listtext
\f1 \uc0\u10003
\f0 }One\
{\listtext
\f1 \uc0\u10003
\f0 }Two\
}
Here I have the Unicode data \u10003, which is equivalent to the "✓" character. I have used
[NSString stringWithCharacters:"\u10003" length:NSUTF16StringEncoding], which throws a compilation error. Please let me know how to convert these Unicode escapes to "✓".
Regards,
Boom
I had the same problem, and the following code solved my issue.
For Encode
NSData *dataenc = [yourtext dataUsingEncoding:NSNonLossyASCIIStringEncoding];
NSString *encodevalue = [[NSString alloc]initWithData:dataenc encoding:NSUTF8StringEncoding];
For decode
NSData *data = [yourtext dataUsingEncoding:NSUTF8StringEncoding];
NSString *decodevalue = [[NSString alloc] initWithData:data encoding:NSNonLossyASCIIStringEncoding];
Thanks
I have used the code below to convert a Unicode string to an NSString. This should work fine.
NSData *unicodedStringData =
[unicodedString dataUsingEncoding:NSUTF8StringEncoding];
NSString *emojiStringValue =
[[NSString alloc] initWithData:unicodedStringData encoding:NSNonLossyASCIIStringEncoding];
In Swift 4
let emoji = "😃"
let unicodedData = emoji.data(using: String.Encoding.utf8, allowLossyConversion: true)
let emojiString = String(data: unicodedData!, encoding: String.Encoding.utf8)
I assume that:
You are reading this RTF data from a file or other external source.
You are parsing it yourself (not using, say, AppKit's built-in RTF parser).
You have a reason why you're parsing it yourself, and that reason isn't “wait, AppKit has this built in?”.
You have come upon \u… in the input you're parsing and need to convert that to a character for further handling and/or inclusion in the output text.
You have ruled out \uc, which is a different thing (it specifies the number of non-Unicode bytes that follow the \u… sequence, if I understood the RTF spec correctly).
\u is followed by decimal digits (the RTF spec defines the parameter as a signed 16-bit decimal number, not hexadecimal). You need to parse those to a number; that number is the UTF-16 code unit for the character the sequence represents. You then need to create an NSString containing that character.
If you're using NSScanner to parse the input, then (assuming you have already scanned past the \u itself) you can simply ask the scanner to scanInt:. Pass a pointer to an int variable.
If you're not using NSScanner, do whatever makes sense for however you're parsing it. For example, if you've converted the RTF data to a C string and are reading through it yourself, you'll want to use strtol to parse the number. It'll interpret the number in whatever base you specify (in this case, 10) and then put the pointer to the next character wherever you want it.
Your int or long variable will then contain the value for the specified character. In the example from your question, that will be 10003, which is 0x2713, or U+2713 (CHECK MARK). RTF writes values above 32767 as negative numbers, so add 65536 to any negative result.
Now, for a value like this one, you can simply assign it to a unichar variable and create an NSString from that. That won't work for a code point above 0xFFFF obtained some other way: unichars only go up to 0xFFFF, and such a code point is outside the Basic Multilingual Plane.
For that case, *CF*String has a function to help you:
unsigned int codePoint = /*…*/;
unichar characters[2];
NSUInteger numCharacters = 0;
if (CFStringGetSurrogatePairForLongCharacter(codePoint, characters)) {
numCharacters = 2;
} else {
characters[0] = codePoint;
numCharacters = 1;
}
You can then use stringWithCharacters:length: to create an NSString from this array of 16-bit characters.
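For reference, the arithmetic behind the surrogate-pair split can be sketched in portable C as follows. The function name utf16_encode is made up for this illustration; it is a stand-in for what the CFString call computes, not a replacement for it.

```c
#include <stdint.h>

/* Split a Unicode code point into UTF-16 code units
   (hypothetical helper). Returns the number of units written
   (1 or 2), or 0 if cp lies beyond U+10FFFF. */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF) {
        return 0;
    }
    if (cp < 0x10000) {
        /* Basic Multilingual Plane: one code unit holds the value. */
        out[0] = (uint16_t)cp;
        return 1;
    }
    /* Supplementary plane: spread 20 bits over two surrogates. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

A BMP value such as 0x2713 comes back as a single unit, while 0x1F603 yields the pair 0xD83D, 0xDE03. The returned count is the length to pass to stringWithCharacters:length:.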
Use this:
NSString *myUnicodeString = @"\u2713";
Thanks to modern Objective-C.
Let me know if it's not what you want.
NSString *strUnicodeString = @"\u2714";
NSData *unicodedStringData = [strUnicodeString dataUsingEncoding:NSUTF8StringEncoding];
NSString *emojiStringValue = [[NSString alloc] initWithData:unicodedStringData encoding:NSUTF8StringEncoding];

how to save bytes to an NSString with UTF8 encoding

I have some NSData that I am passing in as bytes
const void *bytes = [responseData bytes];
Those bytes were originally UTF8 formatted, I am now trying to get them into a UTF8 NSString without messing with the encoding at all.
I previously wrote code that copies the bytes into a C string, which would normally be fine unless the bytes contain non-English characters that take two bytes instead of one. This means any international characters in my string get messed up when I copy them into a C string.
Hence the need to copy the bytes directly into a UTF-8 formatted object, preferably an NSString, if possible.
This is how I was handling the conversion which I later found out is wrong but will hopefully give you a good idea of what I am trying to achieve.
else if (typeWithLocalOrdering == METHOD_RESPONSE)
{
cstring = (char *) malloc(sizeWithLocalOrdering + 1);
strncpy(cstring, bytes, sizeWithLocalOrdering);
cstring[sizeWithLocalOrdering] = '\0';
NSString *resultString = [NSString stringWithCString:cstring encoding:NSUTF16StringEncoding];
methodResponseData =[resultString dataUsingEncoding:NSUTF16StringEncoding]; // methodResponseData is used later on in my parsing method
// Take care of the memory allocation, so that you can find the end-of-file notification
free(cstring);
bytes += sizeWithLocalOrdering;
length -= sizeWithLocalOrdering;
}
Any help would be greatly appreciated.
I don't understand this: "This means any international characters in my string get messed up when I copy them into a cstring." If "sizeWithLocalOrdering" is correct for the actual length of the byte string, it seems like your original code should work (though I would have used memcpy rather than strncpy). If not, nothing's going to work.
Update: OK, I see it. Your original code was wrong here:
[NSString stringWithCString:cstring encoding:NSUTF16StringEncoding];
That should have been NSUTF8StringEncoding.
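On the memcpy remark: copying a length-delimited byte buffer does not disturb multi-byte characters, because the copy is byte for byte. A minimal C sketch follows; the helper name copy_bytes_as_cstring is made up for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Copy exactly len bytes and append a terminating NUL
   (hypothetical helper). Returns a malloc'd buffer that the
   caller frees, or NULL if allocation fails. Unlike strncpy,
   memcpy does not stop at a zero byte inside the data. */
char *copy_bytes_as_cstring(const void *bytes, size_t len)
{
    char *s = malloc(len + 1);
    if (s == NULL) {
        return NULL;
    }
    memcpy(s, bytes, len);
    s[len] = '\0';
    return s;
}
```

The copied bytes are still UTF-8, so whatever reads them next has to be told NSUTF8StringEncoding, which is the actual fix.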
So it turns out I had a few interesting things happening that I was not expecting..
This is the code I used to avoid the C string entirely: it takes the bytes straight into an NSString using their original encoding, then converts that to UTF-16 data:
NSString *tempstring = [[NSString alloc] initWithBytes:bytes length:sizeWithLocalOrdering encoding:NSUTF8StringEncoding];
methodResponseData =[tempstring dataUsingEncoding:NSUTF16StringEncoding]; // methodResponseData is used later on in my parsing method

iOS NSString in UTF16

I have a string that I fetched from an Apache server over HTTP:
- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data {
responseString = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
...
I need to make that string a UTF16 string. I don't want to turn it into NSData. I need to keep it NSString and I need it to be in UTF16.
I would be happy to put it in an NSData object even, if I could do it as UTF16. I'm doing something similar now:
[self.returnedData appendData:data];
But that still transfers it as UTF8.
It's probably simple and I'm missing it. But I don't find it in the Apple docs or this site, and my Google-Fu has failed me.
What am I missing? How do I do that?
Thanks for your time and help.
EDIT:
Ok. All of what you and Justin have said makes sense and makes things make more sense.
So this is what I am doing. It seems to be correct from this line but I wanted to make sure I am understanding you correctly.
NSData *resultData = [self.result dataUsingEncoding:NSUTF16LittleEndianStringEncoding];
NSString *resultStr = [[NSString alloc] initWithData:resultData encoding:NSUTF16LittleEndianStringEncoding];
NSString *md5Result = [[NSString stringWithFormat:@"%@", [resultStr MD5]] uppercaseString];
NSLog(@"md5Result = %@", md5Result);
That last part is what I am doing with the string after it's UTF-16. I have a category that makes it an MD5 hex string similar to http://blog.blackwhale.at/?tag=hmac
Thanks again. I'll bump you guys both and say this is the right answer.
A string is a string is a string. The encoding refers to how it's encoded and decoded to and from NSData. @"blah" is the same as @"blah". There is no UTF-8 or UTF-16 for either of those.
Added
So you can do [@"myString" dataUsingEncoding:NSUTF16StringEncoding];
If you convert that back to a string, you'll still have @"myString"
Answer to the last question in the comment below.
So when you POST to a server, the request body is encoded data. So what you want to do is do whatever you need to the string, THEN convert the string to data using a particular encoding, in your case NSUTF16StringEncoding or NSUTF16LittleEndianStringEncoding. You are NOT creating a UTF-16 string. You are converting a Unicode string to UTF-16 encoded data. This is what you need to do then.
NSData *postBody = [[[self.result MD5] uppercaseString] dataUsingEncoding:NSUTF16LittleEndianStringEncoding];
If you need to add more data to the postBody create NSMutableData instead and append the new data as needed.
NSString holds a buffer of whatever encoding it chooses - that may be UTF-8, UTF-16, or something else.
If you just want to create an NSString from a UTF-16 sequence, try NSUTF16BigEndianStringEncoding or one of its relatives.
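To see what "UTF-16 encoded data" means at the byte level, here is a rough C sketch that turns code points into UTF-16 little-endian bytes. The function name utf16le_encode is made up for this illustration. It assumes the caller provides an output buffer of at least 4 bytes per code point.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode n code points as UTF-16LE bytes (hypothetical helper).
   out must have room for 4 * n bytes. Returns the number of
   bytes written, or 0 if any code point is invalid (n == 0
   also returns 0). */
size_t utf16le_encode(const uint32_t *cps, size_t n, uint8_t *out)
{
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = cps[i];
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return 0;
        }
        if (cp >= 0x10000) {
            /* Supplementary plane: emit a surrogate pair. */
            cp -= 0x10000;
            uint16_t hi = (uint16_t)(0xD800 + (cp >> 10));
            uint16_t lo = (uint16_t)(0xDC00 + (cp & 0x3FF));
            out[o++] = (uint8_t)(hi & 0xFF);
            out[o++] = (uint8_t)(hi >> 8);
            out[o++] = (uint8_t)(lo & 0xFF);
            out[o++] = (uint8_t)(lo >> 8);
        } else {
            /* Low byte first, then high byte. */
            out[o++] = (uint8_t)(cp & 0xFF);
            out[o++] = (uint8_t)(cp >> 8);
        }
    }
    return o;
}
```

The two characters "hi" become the four bytes 68 00 69 00 here, whereas UTF-8 would produce 68 69. The text is the same in both cases; only the data representation differs.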
