Null included in string imported from rs232 - ios

I have a string that is appended to a text view from an RS-232 device. Everything imports, but some of the strings get concatenated as if there were a null value attached to the start and end of some of the imported data. Any idea how to look for a null value in a string?
- (void)readBytesAvailable:(NSInteger)count {
    DataSource *sharedManager = [DataSource sharedManager];
    const int bufferLength = 1024;
    uint8_t buffer[bufferLength];
    NSString *s;
    NSInteger bytesRead = [session read:buffer bufferLength:bufferLength];
    if (bytesRead > 0) {
        // Convert to a string - note that the remote device is sending only well-formed UTF8 text data (e.g. no binary data, no VT100 escape codes, etc).
        s = [[NSString alloc] initWithBytes:buffer length:bytesRead encoding:NSUTF8StringEncoding];
        AllDataTextView.text = [AllDataTextView.text stringByAppendingString:s];
    }
}

The way you are parsing the strings is wrong.
If you are reading bytes in chunks of 1024 bytes, a chunk can easily end in the middle of a character (a UTF-8 character takes 1 to 4 bytes). When you then try to parse such a chunk, you get nil (null), because the buffer does not contain valid data: a partial character is invalid and makes the whole conversion fail.
The ideal solution is to read all the bytes first and then convert them to text. Another solution is to detect partial characters (see the UTF-8 specification), parse only the characters that are whole, and keep the partial character in the buffer until the next iteration.
See Decoding partial UTF-8 into NSString
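The second approach can be sketched in plain C, which can be called from Objective-C as is. This is only a minimal sketch: the function name utf8_complete_prefix_length is made up for illustration, and it only looks at the tail of the chunk; it does not validate the rest of the data. It returns how many bytes are safe to pass to initWithBytes:length:encoding: right now, so the leftover bytes can be prepended to the next read.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns how many leading bytes of buf[0..len) can be decoded now without
   cutting a multi-byte UTF-8 character in half. The remaining bytes
   (at most 3) should be carried over and prepended to the next chunk.
   Hypothetical helper, not part of any Apple API. */
size_t utf8_complete_prefix_length(const uint8_t *buf, size_t len)
{
    size_t i = len;
    size_t back = 0;

    /* Step back over at most 3 trailing continuation bytes (10xxxxxx). */
    while (i > 0 && back < 3 && (buf[i - 1] & 0xC0) == 0x80) {
        i--;
        back++;
    }
    if (i == 0) {
        return len; /* no lead byte in view: let the decoder judge the data */
    }

    uint8_t lead = buf[i - 1];
    size_t need; /* total sequence length announced by the lead byte */
    if (lead < 0x80) {
        need = 1;
    } else if ((lead & 0xE0) == 0xC0) {
        need = 2;
    } else if ((lead & 0xF0) == 0xE0) {
        need = 3;
    } else if ((lead & 0xF8) == 0xF0) {
        need = 4;
    } else {
        return len; /* invalid byte: nothing sensible to hold back */
    }

    size_t have = back + 1; /* lead byte plus the continuation bytes seen */
    return (have < need) ? i - 1 : len;
}
```

In the read loop, you would decode only the first utf8_complete_prefix_length(buffer, bytesRead) bytes and move the rest to the front of the buffer before the next read.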

Related

Bytes are changed after encoding NSString into NSInputStream via NSData

I ran into the following problem when trying to encode an NSString as NSString -> NSData -> NSInputStream and then decode it from the NSInputStream with the read method:
NSString *inputString = [NSString stringWithFormat:@"%c", 255];
NSData *data = [inputString dataUsingEncoding:NSUTF8StringEncoding];
NSInputStream *stream = [NSInputStream inputStreamWithData:data];
[stream open];
uint8_t bytes;
[stream read:&bytes maxLength:1];
NSLog(@"%i", bytes);
The output is 195 instead of 255. Why?
Because of the kind of encoding you used for the string. UTF-8 is a form of string encoding which will end up converting characters with values above 127 into multi-byte sequences. So although inputString contained a single character, your data object didn't actually contain a single byte as you may have assumed, but multiple (two, in this case) bytes. And when you read from the stream, you only read the first byte of the encoded data, but there was more there.
You didn't need to run the data through the input stream to see this result. Accessing the first byte of the NSData instance would have shown the same thing.
You say that this is a "problem" but you don't say what you're actually trying to accomplish. As a character, 255 is just "ÿ" (U+00FF), which is probably not the text you meant to send. If you want to transmit raw data bytes, you can do that directly, rather than using an NSString and string encodings. If you are transmitting strings, then it's already doing the right thing. You just need to be prepared that your data size can exceed your string "length".
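To see where 195 comes from, here is a small sketch of the UTF-8 encoding rule in plain C. The function name utf8_encode is an assumption for illustration, not a Foundation API; dataUsingEncoding: does the equivalent work internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes the UTF-8 encoding of code point cp into out (room for 4 bytes)
   and returns the number of bytes written, or 0 if cp cannot be encoded.
   Hypothetical helper for illustration. */
size_t utf8_encode(uint32_t cp, uint8_t out[4])
{
    if (cp < 0x80) {
        out[0] = (uint8_t)cp;                 /* 0xxxxxxx */
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6)); /* 110xxxxx */
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return 0;                         /* surrogates are not characters */
        }
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (uint8_t)(0xF0 | (cp >> 18));
        out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (uint8_t)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For code point 255 this yields two bytes, 0xC3 (195) and 0xBF (191), so reading a single byte from the stream gives 195.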

Obfuscating a number(in a string) Objective C

I'm using the following code to obfuscate a passcode for a test app of mine.
- (NSString *)obfuscate:(NSString *)string withKey:(NSString *)key
{
    // Create data object from the string
    NSData *data = [string dataUsingEncoding:NSUTF8StringEncoding];
    // Get pointer to data to obfuscate
    char *dataPtr = (char *) [data bytes];
    // Get pointer to key data
    char *keyData = (char *) [[key dataUsingEncoding:NSUTF8StringEncoding] bytes];
    // Points to each char in sequence in the key
    char *keyPtr = keyData;
    int keyIndex = 0;
    // For each character in data, xor with current value in key
    for (int x = 0; x < [data length]; x++)
    {
        // Replace current character in data with
        // current character xor'd with current key value.
        // Bump each pointer to the next character
        *dataPtr = *dataPtr++ ^ *keyPtr++;
        // If at end of key data, reset count and
        // set key pointer back to start of key value
        if (++keyIndex == [key length])
            keyIndex = 0, keyPtr = keyData;
    }
    return [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
}
This works like a charm with all strings, but I've run into a bit of a problem comparing the following results:
NSLog([self obfuscate:@"0000" withKey:@"maki"]); //Returns 0]<W
NSLog([self obfuscate:@"0809" withKey:@"maki"]); //Returns 0]<W
As you can see, the two strings with numbers in them, while different, return the same result! What's gone wrong in the code I've attached to produce the same result for these two numbers?
Another example:
NSLog([self obfuscate:@"8000" withKey:@"maki"]); //Returns 8U4_
NSLog([self obfuscate:@"8290" withKey:@"maki"]); //Returns 8U4_ as well
I may be misunderstanding the concept of obfuscation, but I was under the impression that each unique string returns a unique obfuscated string!
Please help me fix this bug/glitch
Source of Code: http://iosdevelopertips.com/cocoa/obfuscation-encryption-of-string-nsstring.html
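The problem is your last line. You create the new string with the original data object.
You need to create a new NSData object from the modified bytes. Note that the loop advances dataPtr to the end of the buffer, so pass a pointer to the start of the bytes (here a hypothetical startPtr saved before the loop), not dataPtr itself.
NSData *newData = [NSData dataWithBytes:startPtr length:data.length];
return [[NSString alloc] initWithData:newData encoding:NSUTF8StringEncoding];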
But you have some bigger issues.
The call to bytes returns a constant, read-only pointer to the bytes in the NSData object. You should NOT be modifying that data.
The result of your XOR on the character data could result in a byte stream that is no longer a valid UTF-8 encoded string, in which case initWithData:encoding: returns nil.
The obfuscation algorithm that you have selected is based on XORing the data and the "key" values together. Generally, this is not very strong. Moreover, since XOR is its own inverse, anyone who knows or guesses the key can recover the input, and different key/data pairs can produce the same output.
Although your implementation is currently broken, fixing it would not be of much help in preventing the algorithm from producing identical results for different data: it is relatively straightforward to construct key/data pairs that produce the same obfuscated string - for example,
[self obfuscate:@"0123" withKey:@"vwxy"]
[self obfuscate:@"pqrs" withKey:@"6789"]
will produce identical results "FFJJ", even though both the strings and the keys look sufficiently different.
If you would like to "obfuscate" your strings in a cryptographically strong way, use a salted secure hash algorithm: it will produce very different results for even slightly different strings.
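For reference, a repeating-key XOR that avoids the issues above can be sketched in plain C. The name xor_obfuscate and its signature are assumptions for illustration. It writes to a caller-supplied buffer instead of the read-only bytes of an NSData, and it indexes the arrays rather than writing *dataPtr = *dataPtr++ ^ *keyPtr++, which reads and modifies dataPtr in one expression without sequencing and is undefined behaviour in C.

```c
#include <stddef.h>

/* XORs each input byte with the key, repeating the key as needed, and
   writes the result to out (which may be the same buffer as in).
   Does nothing if key_len is 0. Hypothetical helper for illustration. */
void xor_obfuscate(const unsigned char *in, size_t len,
                   const unsigned char *key, size_t key_len,
                   unsigned char *out)
{
    if (key_len == 0) {
        return;
    }
    for (size_t i = 0; i < len; i++) {
        out[i] = in[i] ^ key[i % key_len];
    }
}
```

With a fixed key, two different inputs always give two different outputs, and applying the function twice with the same key restores the input. The output is raw bytes and is not guaranteed to be valid UTF-8, so it is safer to keep it in an NSData (or Base64-encode it) than to force it into an NSString.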

NSString's method 'getCString' returns false for Arabic text

I am generating a QR code and everything works fine if the text is only in English. When I want to generate a QR code with some Arabic text, it fails at NSString's method "getCString:maxLength:encoding:".
Suppose, I have two strings:
NSString *englishText = @"Some text English";
NSString *englishArabicMixText = @"Some text بالعربي";
char strEng [[englishText length] + 1];
char strArb [[englishArabicMixText length] + 1];
1- [englishText getCString:strEng maxLength:[englishText length] + 1 encoding:NSUTF8StringEncoding];
2- [englishArabicMixText getCString:strArb maxLength:[englishArabicMixText length] + 1 encoding:NSUTF8StringEncoding];
In case 1, 'getCString' returns true and the QR code is generated. In case 2, it returns false and the code fails to generate.
What should I do so that case 2 also returns true? Thank you
length returns the number of UTF-16 code units in the string, not the number of bytes its UTF-8 form needs. You have to use lengthOfBytesUsingEncoding:, which returns the number of bytes required to store the receiver in a given encoding.
NSUInteger arbLength = [englishArabicMixText lengthOfBytesUsingEncoding:NSUTF8StringEncoding] + 1;
char strArb [arbLength];
[englishArabicMixText getCString:strArb maxLength:arbLength encoding:NSUTF8StringEncoding];
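The gap between the two lengths can be illustrated with a short plain C sketch. The function name utf8_length_of_utf16 is made up for this example, and it assumes well-formed surrogate pairs; in real code, lengthOfBytesUsingEncoding: is the call to use.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts the bytes needed to store UTF-16 code units as UTF-8, not
   including a terminating NUL. Assumes surrogate pairs are well formed.
   Hypothetical helper for illustration. */
size_t utf8_length_of_utf16(const uint16_t *units, size_t count)
{
    size_t bytes = 0;
    for (size_t i = 0; i < count; i++) {
        uint16_t u = units[i];
        if (u < 0x80) {
            bytes += 1;             /* ASCII */
        } else if (u < 0x800) {
            bytes += 2;             /* includes the Arabic block */
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < count) {
            bytes += 4;             /* surrogate pair: one 4-byte sequence */
            i++;                    /* skip the low surrogate */
        } else {
            bytes += 3;             /* rest of the Basic Multilingual Plane */
        }
    }
    return bytes;
}
```

A string of 10 ASCII characters followed by 7 Arabic letters has a length of 17 but needs 24 bytes in UTF-8 (25 with the terminator), which is why an 18-byte buffer is too small.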
Case 2 returns false for one of two possible reasons:
1) the string cannot be converted with the specified encoding.
2) the buffer to hold the encoded string is too small.
I'd guess (or at least I suggest you start investigating there) that the problem is no. 2.
When you convert to UTF-8, a single character may be encoded as more than one byte. An 'A' is a single byte with value 65, but an Arabic character or some kind of symbol may require more bytes.
You are assuming that your destination buffer needs as many bytes as your NSString has characters.
So you should do something like this:
NSUInteger size = [englishArabicMixText lengthOfBytesUsingEncoding:NSUTF8StringEncoding];
if (size > 0)
{
    size++; // room for the terminating NUL
    char *strArb = malloc(size); // NOTE: you should allocate space for your string at runtime!!
    [englishArabicMixText getCString:strArb maxLength:size encoding:NSUTF8StringEncoding];
    // ... use strArb here, then release it
    free(strArb);
}
You should do the same for the plain English string too.
And I'd recommend allocating the space for the C string dynamically at runtime with malloc, then freeing it when you don't need it anymore.

how to save bytes to an NSString with UTF8 encoding

I have some NSData that I am passing in as bytes
const void *bytes = [responseData bytes];
Those bytes were originally UTF-8 formatted. I am now trying to get them into a UTF-8 NSString without messing with the encoding at all.
I previously wrote the code below, which copies the bytes into a C string. That would normally be fine, unless the bytes contain non-English characters, which take two or more bytes instead of one. This means any international characters in my string get messed up when I copy them into a C string.
Hence the need to copy the bytes directly into a UTF-8 formatted object, preferably an NSString, if possible.
This is how I was handling the conversion which I later found out is wrong but will hopefully give you a good idea of what I am trying to achieve.
else if (typeWithLocalOrdering == METHOD_RESPONSE)
{
    cstring = (char *) malloc(sizeWithLocalOrdering + 1);
    strncpy(cstring, bytes, sizeWithLocalOrdering);
    cstring[sizeWithLocalOrdering] = '\0';
    NSString *resultString = [NSString stringWithCString:cstring encoding:NSUTF16StringEncoding];
    methodResponseData = [resultString dataUsingEncoding:NSUTF16StringEncoding]; // methodResponseData is used later on in my parsing method
    // Take care of the memory allocation, so that you can find the endoffile notification
    free(cstring);
    bytes += sizeWithLocalOrdering;
    length -= sizeWithLocalOrdering;
}
Any help would be greatly appreciated.
I don't understand this: "This means any international characters in my string get messed up when I copy them into a cstring." If "sizeWithLocalOrdering" is correct for the actual length of the byte string, it seems like your original code should work (though I would have used memcpy rather than strncpy). If not, nothing's going to work.
Update: OK, I see it. Your original code was wrong here:
[NSString stringWithCString:cstring encoding:NSUTF16StringEncoding];
That should have been NSUTF8StringEncoding.
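On the memcpy remark: strncpy stops copying at the first zero byte and pads the rest with zeros, so it is only safe when the data is known to contain no NUL bytes. A minimal sketch of the copy step using memcpy, with a made-up helper name copy_bytes_terminated, could look like this:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Copies exactly len bytes and appends a terminating NUL. Unlike strncpy,
   memcpy does not stop at a zero byte inside the data.
   Returns NULL if allocation fails; the caller frees the result.
   Hypothetical helper for illustration. */
char *copy_bytes_terminated(const void *bytes, size_t len)
{
    char *copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, bytes, len);
    copy[len] = '\0';
    return copy;
}
```

The copy only helps if the result is then decoded with the encoding the bytes were actually written in, which is NSUTF8StringEncoding here.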
So it turns out I had a few interesting things happening that I was not expecting.
This is the code I used to avoid the C string altogether and take the bytes straight to an NSString in their original encoding:
NSString *tempstring = [[NSString alloc] initWithBytes:bytes length:sizeWithLocalOrdering encoding:NSUTF8StringEncoding];
methodResponseData =[tempstring dataUsingEncoding:NSUTF16StringEncoding]; // methodResponseData is used later on in my parsing method

iOS - XML to NSString conversion

I'm using NSXMLParser to parse XML in my app and I'm having a problem with the encoding type. For example, here is one of the feeds coming in. It looks similar to this:
\U2026Some random text from the xml feed\U2026
I am currently using the encoding type:
NSData *data = [string dataUsingEncoding:NSUTF8StringEncoding];
Which encoding type am I supposed to use for converting \U2026 into an ellipsis (…)?
The answer here is you're screwed. They are using a non-standard encoding for XML, but what if they really want the literal \U2026? Let's say you add a decoder to handle all \UXXXX and \uXXXX encodings. What happens when another feed wants the data to be the literal \U2026?
Your first choice and best bet is to get this feed fixed. If they need to encode data, they need to use proper entities or numeric character references (for example &#x2026;).
As a fallback, I would isolate the decoder away from the XML parser. Don't create a non-conforming XML parser just because you're getting non-conforming data. Have a post-processor that is only run on the offending feed.
If you must have a decoder, then there is more bad news. There is no built-in decoder; you will need to find a category online or write one yourself.
After some poking around, I think "Using Objective C/Cocoa to unescape unicode characters, ie \u1234" may work for you.
Alright, here's a snippet of code that should work for four-digit \uXXXX and \UXXXX escapes (code points up to U+FFFF):
NSString *stringByUnescapingUnicodeSymbols(NSString *input)
{
    NSUInteger length = [input length];
    NSMutableString *output = [NSMutableString stringWithCapacity:length];
    NSUInteger i = 0;
    while (i < length) {
        unichar c = [input characterAtIndex:i];
        unichar next = (i + 1 < length) ? [input characterAtIndex:i + 1] : 0;
        // an escape is '\', then 'u' or 'U', then four more characters
        if (c == '\\' && (next == 'u' || next == 'U') && i + 6 <= length)
        {
            // make sure we only read 4 chars
            NSString *hex = [input substringWithRange:NSMakeRange(i + 2, 4)];
            // remember that the escape is base 16; the digits are not validated
            c = (unichar)strtol([hex UTF8String], NULL, 16);
            // skip the whole escape
            i += 6;
        }
        else
        {
            i++;
        }
        // append one UTF-16 code unit, so non-ASCII input passes through intact
        [output appendString:[NSString stringWithCharacters:&c length:1]];
    }
    return output;
}
You should simply replace the literal '\U2026' with the ellipsis character, then encode the string to NSData with NSUTF8StringEncoding.
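If the post-processing has to happen on the raw UTF-8 bytes before they reach the parser, the same replacement can be sketched in plain C. The names hex_value and unescape_unicode are assumptions for illustration. Unlike a bare strtol call, this version checks that all four characters are hex digits and leaves malformed escapes untouched.

```c
#include <stddef.h>

/* Returns the value of a hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Copies in to out, replacing each \uXXXX or \UXXXX (exactly four hex
   digits) with the UTF-8 encoding of that code point. Everything else,
   including malformed escapes, \u0000 and surrogates, is copied unchanged.
   out needs room for strlen(in) + 1 bytes: a 6-byte escape never expands.
   Hypothetical helper for illustration. */
void unescape_unicode(const char *in, char *out)
{
    while (*in) {
        int d0, d1, d2, d3;
        if (in[0] == '\\' && (in[1] == 'u' || in[1] == 'U') &&
            (d0 = hex_value(in[2])) >= 0 && (d1 = hex_value(in[3])) >= 0 &&
            (d2 = hex_value(in[4])) >= 0 && (d3 = hex_value(in[5])) >= 0) {
            unsigned cp = (unsigned)(d0 << 12 | d1 << 8 | d2 << 4 | d3);
            if (cp != 0 && (cp < 0xD800 || cp > 0xDFFF)) {
                if (cp < 0x80) {
                    *out++ = (char)cp;
                } else if (cp < 0x800) {
                    *out++ = (char)(0xC0 | (cp >> 6));
                    *out++ = (char)(0x80 | (cp & 0x3F));
                } else {
                    *out++ = (char)(0xE0 | (cp >> 12));
                    *out++ = (char)(0x80 | ((cp >> 6) & 0x3F));
                    *out++ = (char)(0x80 | (cp & 0x3F));
                }
                in += 6;
                continue;
            }
        }
        *out++ = *in++; /* ordinary byte: copy as is */
    }
    *out = '\0';
}
```

As noted in the first answer, this should only be run on the offending feed, since a feed that really means the literal text \U2026 would be altered by it.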

Resources