unichar* to NSString, get the length - ios

I am trying to create an NSString object from a const unichar buffer where I don't know the length of the buffer.
I want to use the NSString stringWithCharacters: length: method to create the string (this seems to work), but please can you help me find out the length?
I have:
const unichar *c_emAdd = [... returns successfully from a C++ function...]
NSString *emAdd = [NSString stringWithCharacters:c_emAdd length:unicharLen];
Can anyone help me find out how to check what unicharLen is? I don't get this length passed back to me by the call to the C++ function, so I presume I'd need to iterate until I find a terminating character? Anyone have a code snippet to help? Thanks!

Is your char buffer null terminated?
Is it 16-bit unicode?
NSString *emAdd = [NSString stringWithFormat:@"%S", c_emAdd];

Your unichars should be null terminated, so when you reach two null bytes (a unichar equal to 0x0000) in the buffer, you will know the length.
unsigned long long unistrlen(const unichar *chars)
{
    unsigned long long length = 0llu;
    if (NULL == chars) return length;
    // Compare against 0, not NULL: chars[length] is a unichar, not a pointer.
    while (0 != chars[length])
        length++;
    return length;
}
//...
//Inside Some method or function
unichar chars[] = { 0x005A, 0x0065, 0x0062, 0x0072, 0x0061, 0x0000 };
NSString *string = [NSString stringWithCharacters:chars length:unistrlen(chars)];
NSLog(@"%@", string);
Or, even simpler, use a format string with the %S specifier.
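If there is any chance the terminator is missing, an unbounded scan will read past the end of the buffer. Here is a minimal sketch in plain C (it also compiles as Objective-C) of a bounded variant, assuming you know an upper limit on the buffer size. The names utf16_strnlen and max_units are made up for illustration, and uint16_t stands in for unichar.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts 16-bit units before the first 0x0000,
   but never reads more than max_units elements. */
size_t utf16_strnlen(const uint16_t *chars, size_t max_units)
{
    /* A NULL buffer is only acceptable when nothing may be read. */
    assert(chars != NULL || max_units == 0);
    size_t length = 0;
    while (length < max_units && chars[length] != 0)
        length++;
    return length;
}
```

If the result equals max_units, no terminator was found within the limit, and the caller can decide whether to treat that as an error.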

Related

How to put unicode char into NSString

For example I could type an emoji character code such as:
NSString* str = @"😊";
NSLog(@"%@", str);
The smile emoji would be seen in the console.
Presumably the code editor and the compiler treat the literal as UTF-8.
Now I'm working in a full Unicode environment, I mean 32 bits per character, and I have the Unicode code point of the emoji. I want to convert the 32-bit code point into an NSString, for example:
int charcode = 0x0001F60A;
NSLog(@"%??", charcode);
The question is: what should I put at the "??" position so that the charcode is formatted as an emoji string?
BTW, charcode is a variable whose value cannot be determined at compile time.
I don't want to convert the 32-bit int into UTF-8 bytes unless that is the only way.
If 0x0001F60A is a dynamic value determined at runtime then
you can use the NSString method
- (instancetype)initWithBytes:(const void *)bytes length:(NSUInteger)len encoding:(NSStringEncoding)encoding;
to create a string containing a character with the given Unicode value:
int charcode = 0x0001F60A;
uint32_t data = OSSwapHostToLittleInt32(charcode); // Convert to little-endian
NSString *str = [[NSString alloc] initWithBytes:&data length:4 encoding:NSUTF32LittleEndianStringEncoding];
NSLog(@"%@", str); // 😊
Use the NSString initialization method:
int charcode = 0x0001F60A;
// Passing the int directly assumes a little-endian host.
NSLog(@"%@", [[NSString alloc] initWithBytes:&charcode length:4 encoding:NSUTF32LittleEndianStringEncoding]);
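Another route that avoids UTF-8 is to split the code point into UTF-16 units yourself, since NSString works in UTF-16 internally. The following is a rough sketch in plain C (which also compiles as Objective-C). The function name utf16_encode is hypothetical, and uint16_t stands in for unichar.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the UTF-16 encoding of code_point into out
   and returns the number of 16-bit units written (1 or 2), or 0 if the
   value is not a valid Unicode scalar value. */
size_t utf16_encode(uint32_t code_point, uint16_t out[2])
{
    assert(out != NULL);
    if (code_point > 0x10FFFF || (code_point >= 0xD800 && code_point <= 0xDFFF))
        return 0;                               /* out of range or a lone surrogate */
    if (code_point < 0x10000) {
        out[0] = (uint16_t)code_point;          /* fits in a single unit */
        return 1;
    }
    uint32_t v = code_point - 0x10000;          /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (v >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 + (v & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}
```

For 0x0001F60A this produces the pair 0xD83D, 0xDE0A. The units and the returned count could then be handed to stringWithCharacters:length:.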

How to find String Length Without [str length]?

How can I find the length of a string without using the length method? Can anyone suggest an approach, and what type of algorithm is used to find a string's length?
I already know about [str length];
Is any other option available? If so, please tell me.
Thanks.
I hope this helps you
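NSString *foo = @"IDontWantToUseStringLength";
// wchar_t is 32 bits wide on Apple platforms, so request UTF-32 rather than UTF-16.
// This counts code points, which can differ from [foo length] for characters outside the BMP.
const wchar_t *str = (const wchar_t*)[foo cStringUsingEncoding:NSUTF32LittleEndianStringEncoding];
int len = 0;
while (str[len] != L'\0') {
    len++;
}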

Converting NSString to unichar in iOS

I have seen questions on Stack Overflow that convert unichar to NSString, but now I would like to do the reverse.
How do I do it?
I need some guidance. Thanks.
For example, I have an array of strings: [@"o", @"p", @"q"];
These are the strings inside. How do I convert them back to unichar?
The following will work as long as the first character isn't actually two composed characters (in other words as long as the character doesn't have a Unicode value greater than \UFFFF):
unichar ch = [someString characterAtIndex:0];
You could convert it to a buffer in NSData:
if ([string canBeConvertedToEncoding:NSUnicodeStringEncoding]) {
    NSData * data = [string dataUsingEncoding:NSUnicodeStringEncoding];
    // Note: with NSUnicodeStringEncoding the data typically begins with a byte-order mark (U+FEFF).
    const unichar* const ptr = (const unichar*)data.bytes;
    ...
}
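To illustrate the caveat about values above \UFFFF: such a character occupies two unichars (a surrogate pair), so a single unichar cannot hold it. Below is a small sketch in plain C (also valid Objective-C) that recovers the full code point from the first one or two units. The name utf16_decode_first is invented for this example, and uint16_t stands in for unichar.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decodes the code point starting at units[0] and
   stores the number of units consumed (1 or 2) in *consumed.
   An unpaired surrogate is returned unchanged with *consumed == 1. */
uint32_t utf16_decode_first(const uint16_t *units, size_t count, size_t *consumed)
{
    assert(units != NULL && count > 0 && consumed != NULL);
    uint16_t hi = units[0];
    if (hi >= 0xD800 && hi <= 0xDBFF && count >= 2) {
        uint16_t lo = units[1];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {
            *consumed = 2;
            /* Combine the two 10-bit halves and add back the 0x10000 offset. */
            return 0x10000 + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
        }
    }
    *consumed = 1;
    return hi;
}
```

When the consumed count comes back as 2, the character does not fit in one unichar and needs both units.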

EXC_BAD_ACCESS error when using NSString getCString

I'm trying to parse some HTML. I use stringWithContentsOfURL to get the HTML. I attempt to load this into a character array so I can parse it, but I crash with the EXC_BAD_ACCESS error when getCString is called. Here is the relevant code:
- (void)parseStoryWithURL:(NSURL *)storyURL
{
    _paragraphs = [[NSMutableArray alloc] initWithCapacity:10];
    _read = NO;
    NSError* error = nil;
    NSString* originalFeed = [NSString stringWithContentsOfURL:storyURL encoding:NSUTF8StringEncoding error:&error];
    _i = originalFeed.length;
    char* entireFeed = malloc(_i*sizeof(char));
    char* current = entireFeed;
    char* lagger;
    char* recentChars = malloc(7);
    BOOL collectRecent = NO;
    BOOL paragraphStarted = NO;
    BOOL paragraphEnded = NO;
    int recentIndex = 0;
    int paragraphSize = 0;
    NSLog(@"original Feed: %@", originalFeed);
    _read = [originalFeed getCString:*entireFeed maxLength:_i encoding:NSUTF8StringEncoding];
I've also tried this passing the 'current' pointer to getCString but it behaves the same. From what I've read this error is typically thrown when you try to read from deallocated memory. I'm programming for iOS 5 with memory management. The line before that I print the HTML to the log and everything is fine. Help would be appreciated. I need to get past this error so I can test/debug my HTML parsing algorithms.
PS: If someone with enough reputation is allowed to, please add "getCString" as a tag. Apparently no one uses this function :(
There are several issues with your code - you're passing the wrong pointers and not reserving enough space. Probably the easiest is to use UTF8String instead:
char *entireFeed = strdup([originalFeed UTF8String]);
At the end you'll have to free the string with free(entireFeed) though. If you don't modify it you can use
const char *entireFeed = [originalFeed UTF8String];
directly.
If you want to use getCString, you'll need to determine the length first - which has to include the termination character as well as extra space for encoded characters, so something like:
NSUInteger len = [originalFeed lengthOfBytesUsingEncoding: NSUTF8StringEncoding] + 1;
char entireFeed[len];
[originalFeed getCString:entireFeed maxLength:len encoding:NSUTF8StringEncoding];
Try explicitly malloc'ing entireFeed with a length of _i (not 100% certain of this, as NSUTF8String might also include double byte unichars or wchars) instead of the wacky char * entireFeed[_i] thing you're doing.
I can't imagine char * entireFeed[_i] is working at run-time (and instead, you're passing a NULL pointer to your getCString method).
A few strange things:
char* entireFeed[_i]; allocates an array of char*, not an array of char. I suspect you want char entireFeed[_i] or char *entireFeed = malloc(_i*sizeof(char));
getCString takes a char* as a first parameter, that is, you should send it entireFeed instead of *entireFeed.
Also, note that the (UTF-8) encoding may add bytes to the result, so allocating the buffer the exact size of the input may cause the method to return NO (buffer too small). You should really use [originalFeed UTF8String] instead.
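To see why a buffer sized from the string's length can be too small: length counts UTF-16 units, while UTF-8 needs one to three bytes per unit, or four bytes for a surrogate pair. The sketch below, in plain C (also valid Objective-C), computes the byte count by hand. The name utf8_length_of_utf16 is hypothetical and uint16_t stands in for unichar. In real code, lengthOfBytesUsingEncoding: already does this for you.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of UTF-8 bytes needed to encode `count`
   UTF-16 units, not counting the terminating NUL byte. */
size_t utf8_length_of_utf16(const uint16_t *units, size_t count)
{
    assert(units != NULL || count == 0);
    size_t bytes = 0;
    for (size_t i = 0; i < count; i++) {
        uint16_t u = units[i];
        if (u < 0x80) {
            bytes += 1;                 /* ASCII */
        } else if (u < 0x800) {
            bytes += 2;
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < count &&
                   units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            bytes += 4;                 /* a surrogate pair is one 4-byte sequence */
            i++;                        /* skip the low surrogate */
        } else {
            bytes += 3;                 /* rest of the BMP; lone surrogates counted as 3 */
        }
    }
    return bytes;
}
```

For example, the four units of "café" need five UTF-8 bytes plus one more for the terminator, so a four-byte buffer is too small.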

Find Character String In Binary Data

I have a binary file I've loaded using an NSData object. Is there a way to locate a sequence of characters, 'abcd' for example, within that binary data and return the offset without converting the entire file to a string? Seems like it should be a simple answer, but I'm not sure how to do it. Any ideas?
I'm doing this on iOS 3 so I don't have -rangeOfData:options:range: available.
I'm going to award this one to Sixteen Otto for suggesting strstr. I went and found the source code for the C function strstr and rewrote it to work on a fixed length Byte array--which incidentally is different from a char array as it is not null terminated. Here is the code I ended up with:
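- (Byte*)offsetOfBytes:(Byte*)bytes inBuffer:(const Byte*)buffer ofLength:(int)len
{
    // bytes is the null-terminated pattern; buffer is the data, len bytes long
    if ( !*bytes )
        return (Byte*)buffer;   // an empty pattern matches at the start
    const Byte *end = buffer + len;
    const Byte *cp, *s1, *s2;
    for (cp = buffer; cp < end; ++cp)
    {
        s1 = cp;
        s2 = bytes;
        // advance while the pattern continues and the buffer bytes match;
        // zero bytes in the buffer are ordinary data, so only len bounds it
        while ( s1 < end && *s2 && *s1 == *s2 )
            s1++, s2++;
        if (!*s2)
            return (Byte*)cp;   // reached the end of the pattern: match
    }
    return NULL;
}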
This returns a pointer to the first occurrence of bytes, the thing I'm looking for, in buffer, the byte array that should contain bytes.
I call it like this:
// data is the NSData object
const Byte *bytes = [data bytes];
Byte* index = [self offsetOfBytes:tag inBuffer:bytes ofLength:[data length]];
Convert your substring to an NSData object, and search for those bytes in the larger NSData using rangeOfData:options:range:. Make sure that the string encodings match!
On iPhone, where that isn't available, you may have to do this yourself. The C function strstr() will give you a pointer to the first occurrence of a pattern within the buffer (as long as neither contain nulls!), but not the index. Here's a function that should do the job (but no promises, since I haven't tried actually running it...):
- (NSUInteger)indexOfData:(NSData*)needle inData:(NSData*)haystack
{
    // use byte pointers: a const void* cannot be indexed
    const unsigned char* needleBytes = [needle bytes];
    const unsigned char* haystackBytes = [haystack bytes];
    // a needle longer than the haystack can never match; checking first also
    // keeps the unsigned subtraction below from wrapping around
    if ([needle length] > [haystack length])
    {
        return NSNotFound;
    }
    // walk the length of the buffer, looking for a byte that matches the start
    // of the pattern; we can skip (|needle|-1) bytes at the end, since we can't
    // have a match that's shorter than needle itself
    for (NSUInteger i=0; i < [haystack length]-[needle length]+1; i++)
    {
        // walk needle's bytes while they still match the bytes of haystack
        // starting at i; if we walk off the end of needle, we found a match
        NSUInteger j=0;
        while (j < [needle length] && needleBytes[j] == haystackBytes[i+j])
        {
            j++;
        }
        if (j == [needle length])
        {
            return i;
        }
    }
    return NSNotFound;
}
This runs in something like O(nm), where n is the buffer length, and m is the size of the substring. It's written to work with NSData for two reasons: 1) that's what you seem to have in hand, and 2) those objects already encapsulate both the actual bytes, and the length of the buffer.
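For comparison, here is roughly the same O(nm) search as a plain C function (it also compiles as Objective-C), using explicit lengths so that zero bytes in either buffer are harmless. This is only a sketch: find_bytes and its parameter names are made up, and it returns (size_t)-1 where the method above would return NSNotFound.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the offset of the first occurrence of
   needle in haystack, or (size_t)-1 when there is none. */
size_t find_bytes(const unsigned char *haystack, size_t haystack_len,
                  const unsigned char *needle, size_t needle_len)
{
    assert(haystack != NULL && needle != NULL);
    if (needle_len == 0)
        return 0;                       /* an empty pattern matches at offset 0 */
    if (needle_len > haystack_len)
        return (size_t)-1;              /* the pattern cannot fit */
    for (size_t i = 0; i + needle_len <= haystack_len; i++) {
        /* memcmp compares exactly needle_len bytes, zero bytes included. */
        if (memcmp(haystack + i, needle, needle_len) == 0)
            return i;
    }
    return (size_t)-1;
}
```

With an NSData object, the haystack arguments would come from its bytes and length methods.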
If you're using Snow Leopard, a convenient way is the new -rangeOfData:options:range: method in NSData that returns the range of the first occurrence of a piece of data. Otherwise, you can access the NSData's contents yourself using its -bytes method to perform your own search.
I had the same problem.
I solved it doing the other way round, compared to the suggestions.
First, I reformat the data (assume your NSData is stored in the variable rawFile) with:
NSString *ascii = [[NSString alloc] initWithData:rawFile encoding:NSASCIIStringEncoding];
Now you can easily do string searches for 'abcd' or whatever you want by passing the ascii string to an NSScanner. This may not be very efficient, but it works until the -rangeOfData method becomes available on iPhone as well.
