Substring char * in Objective-C - iOS

I need to take a substring of a char * up to a given length and then convert it to an NSString. Given char *val and the desired length, I tried:
NSString *tempString = [NSString stringWithCString:val encoding:NSASCIIStringEncoding];
NSRange range = NSMakeRange (0, length);
NSString *finalValue = [tempString substringWithRange: range];
This works, but not for languages with special characters such as Chinese.
If I convert with UTF-8 encoding instead, the substring length no longer matches.
Is there another way to take the substring of the char * and then convert it with UTF-8 encoding?

You have to use the encoding that the string is actually encoded in.
In your case you tell NSString to interpret the bytes as ASCII. ASCII has no Chinese characters, so this cannot work for Chinese text: those characters simply are not there.
Most likely you have a UTF-8 encoded string. But simply switching to UTF-8 does not solve everything. NSString on OS X/iOS stores text as 16-bit UTF-16 code units, while Unicode code points go beyond 16 bits, so a Chinese character outside the Basic Multilingual Plane needs two code units. One effect is that -length returns the number of code units, not the number of characters. However, with -rangeOfComposedCharacterSequencesForRange: you can adjust the range so it does not split such a character.
For example 𠀖 (CJK Unified Ideograph U+20016):
NSString *str = @"𠀖"; // one Chinese character
NSLog(@"%lu", (unsigned long)[str length]); // prints 2: two UTF-16 code units
NSRange range = {0, 1}; // range for the "first" character
NSLog(@"%lu %lu", (unsigned long)range.location, (unsigned long)range.length); // 0 1
range = [str rangeOfComposedCharacterSequencesForRange:range];
NSLog(@"%lu %lu", (unsigned long)range.location, (unsigned long)range.length); // 0 2
You can get a better answer if you add information about the encoding of the incoming string and the encoding required for the output.
Strings are not "UTF-8 strings" or anything of the sort. Strings are strings. Their storage, their representation in computer memory, has an encoding, but the strings themselves do not.
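To illustrate the approach described above, here is a minimal sketch, assuming the incoming char * is NUL-terminated UTF-8 and that the requested length counts user-visible characters rather than bytes or UTF-16 code units; the helper name is made up for illustration:
#import <Foundation/Foundation.h>

static NSString *SubstringFromUTF8(const char *val, NSUInteger length) {
    NSString *full = [NSString stringWithUTF8String:val];
    if (full == nil) return nil;                      // bytes were not valid UTF-8

    NSUInteger taken = 0;                             // composed characters taken so far
    NSUInteger index = 0;                             // current UTF-16 index
    while (index < [full length] && taken < length) {
        NSRange r = [full rangeOfComposedCharacterSequenceAtIndex:index];
        index = NSMaxRange(r);                        // skip the whole sequence (1 or 2 unichars)
        taken++;
    }
    return [full substringToIndex:index];
}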

I found the solution to my question:
char subString[length+1];
strncpy(subString, val, length);
subString[length] = '\0'; // place the null terminator
NSString *finalString = [NSString stringWithCString: subString encoding:NSUTF8StringEncoding];
This gives me both the char * substring and the UTF-8 conversion.
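One caveat worth noting: strncpy cuts at a byte count, so it can split a multi-byte UTF-8 sequence, in which case stringWithCString:encoding: should return nil. A hedged sketch of the same approach with that check made explicit (buffer and function names are just for illustration):
#import <Foundation/Foundation.h>
#include <string.h>

static NSString *SubstringBytes(const char *val, size_t length) {
    char buffer[length + 1];
    strncpy(buffer, val, length);
    buffer[length] = '\0';                            // place the null terminator
    NSString *result = [NSString stringWithCString:buffer
                                           encoding:NSUTF8StringEncoding];
    return result;                                    // nil if the cut broke a UTF-8 sequence
}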

Related

Convert Hex String to ASCII Format [duplicate]

I have a hex string like "000000000100" and I am using the following logic to do the ASCII conversion. The output I receive is only one byte (\x01), but I want the output in 6-byte form, as \x00\x00\x00\x00\x01\x00.
- (NSString *)decode
{
    string = @"000000000100";
    NSMutableString *newString = [[NSMutableString alloc] init];
    int i = 0;
    while (i < [string length])
    {
        NSString *hexChar = [string substringWithRange:NSMakeRange(i, 2)];
        int value = 0;
        sscanf([hexChar cStringUsingEncoding:NSASCIIStringEncoding], "%x", &value);
        [newString appendFormat:@"%c", (char)value];
        i += 2;
    }
    return newString;
}
How can I do that?
Let's first directly address your bug. In your code you attempt to add the next byte to your string with:
[newString appendFormat:@"%c", (char)value];
The problem is that %c produces nothing when the value is zero, so you append an empty string and, as you found, end up with a string containing a single byte.
You can fix your code by testing for the null and appending a string containing a single null:
if (value == 0)
    [newString appendString:@"\0"]; // append a single null
else
    [newString appendFormat:@"%c", (char)value];
Second, is this the way to do this?
Other answers have shown you other algorithms; they might be more efficient than yours, as they convert to a C string only once rather than repeatedly extracting substrings and converting each one individually.
If, and only if, performance is a real issue for you, you might wish to consider such C-based solutions. You clearly know how to use sscanf, but in a case as simple as this you might want to look at digittoint and do the conversion of two hex digits to an integer yourself (value of the first * 16 + value of the second).
Conversely, if you'd like to avoid C and sscanf, look at NSScanner and scanHexInt:/scanHexLongLong:. If your strings are never longer than 16 hex digits, you can convert the whole string in one go and then produce an NSString from the bytes of the resulting unsigned 64-bit integer.
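As an illustration of the digittoint route (not the asker's original code), a rough sketch, assuming the input holds an even number of valid hex digits; building the raw bytes first keeps the zero bytes from being dropped:
#import <Foundation/Foundation.h>
#include <ctype.h>   // digittoint() is a BSD extension available on iOS/macOS
#include <string.h>

static NSString *StringFromHex(NSString *hex) {
    const char *c = [hex cStringUsingEncoding:NSASCIIStringEncoding];
    if (c == NULL || strlen(c) % 2 != 0) return nil;

    NSMutableData *bytes = [NSMutableData dataWithCapacity:strlen(c) / 2];
    for (size_t i = 0; c[i] != '\0'; i += 2) {
        uint8_t value = (uint8_t)(digittoint(c[i]) * 16 + digittoint(c[i + 1]));
        [bytes appendBytes:&value length:1];
    }
    // ISO Latin 1 maps every byte 0x00-0xFF to a character, so "000000000100"
    // should come back as the 6-character string \x00\x00\x00\x00\x01\x00.
    return [[NSString alloc] initWithData:bytes encoding:NSISOLatin1StringEncoding];
}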
HTH

How to convert unicode hex number variable to character in NSString?

Now I have a range of Unicode numbers and I want to show them in a UILabel. I can show them if I hardcode them, but that is too slow, so I want to substitute a variable, change the variable, and get the corresponding character.
For example, I know the code point is U+095F and I want to show the range U+095F to U+096F in a UILabel. I can do that with a hardcoded literal like
NSString *str = [NSString stringWithFormat:@"\u095f"];
but I want to do something like
NSInteger hex = 0x095f;
[NSString stringWithFormat:@"\u%ld", (long)hex];
so that I can change hex programmatically, just as with @"%ld", (long)hex. Does anybody know how to implement that?
You can initialize the string from a buffer containing the bytes of the value (you simply provide a pointer to it). The important thing to notice is that you specify the character encoding to be applied, and in particular that you pay attention to the byte order.
Here's an example:
UInt32 hex = 0x095f;
NSString *unicodeString = [[NSString alloc] initWithBytes:&hex length:sizeof(hex) encoding:NSUTF32LittleEndianStringEncoding];
Note that solutions like using the %C format are fine as long as you use them for 16-bit Unicode characters; 32-bit Unicode characters such as emoji (for example 0x1F601, 0x1F41A) will not work with simple formatting.
You would have to use
[NSString stringWithFormat:@"%C", (unichar)hex];
or directly declare the unichar (unsigned short) as
unichar uni = 0x095f;
[NSString stringWithFormat:@"%C", uni];
A useful resource might be the String Format Specifiers, which lists %C as
16-bit Unicode character (unichar), printed by NSLog() as an ASCII character, or, if not an ASCII character, in the octal format \ddd or the Unicode hexadecimal format \udddd, where d is a digit.
Like this:
unichar charCode = 0x095f;
NSString *s = [NSString stringWithFormat:@"%C", charCode];
NSLog(@"String = %@", s); // Output: String = य़
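To tie this back to the original goal (showing U+095F through U+096F in a UILabel), a small sketch, assuming all of the code points stay inside the BMP so %C is sufficient; the label variable is hypothetical:
NSMutableString *devanagari = [NSMutableString string];
for (unichar code = 0x095F; code <= 0x096F; code++) {
    [devanagari appendFormat:@"%C", code];     // each code point is a single unichar here
}
// label.text = devanagari;                    // e.g. assign to the UILabel
NSLog(@"%@", devanagari);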

How to put unicode char into NSString

For example, I could type an emoji literal such as:
NSString *str = @"😊";
NSLog(@"%@", str);
The smiley emoji would be seen in the console.
Presumably the code editor and the compiler handle the literal as UTF-8.
Now I'm working in a full Unicode environment, I mean 32 bits per character, and I've got the Unicode value of the emoji. I want to convert that 32-bit value into an NSString, for example:
int charcode = 0x0001F60A;
NSLog(@"%??", charcode);
The question is what should I put at the "??" position so that I can format the charcode into an emoji string?
BTW, charcode is a variable whose value cannot be determined at compile time.
I don't want to compress the 32-bit int into UTF-8 bytes unless that is the only way.
If 0x0001F60A is a dynamic value determined at runtime then
you can use the NSString method
- (instancetype)initWithBytes:(const void *)bytes length:(NSUInteger)len encoding:(NSStringEncoding)encoding;
to create a string containing a character with the given Unicode value:
int charcode = 0x0001F60A;
uint32_t data = OSSwapHostToLittleInt32(charcode); // Convert to little-endian
NSString *str = [[NSString alloc] initWithBytes:&data length:4 encoding:NSUTF32LittleEndianStringEncoding];
NSLog(@"%@", str); // 😊
Use the NSString initialization method:
int charcode = 0x0001F60A;
NSLog(@"%@", [[NSString alloc] initWithBytes:&charcode length:4 encoding:NSUTF32LittleEndianStringEncoding]);
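Wrapping either variant in a tiny helper keeps the byte-order handling in one place. A sketch, assuming the input is a valid Unicode scalar value; the function name is made up:
#import <Foundation/Foundation.h>
#include <libkern/OSByteOrder.h>

static NSString *StringFromCodePoint(uint32_t codePoint) {
    uint32_t littleEndian = OSSwapHostToLittleInt32(codePoint);  // make the byte order explicit
    return [[NSString alloc] initWithBytes:&littleEndian
                                    length:sizeof(littleEndian)
                                  encoding:NSUTF32LittleEndianStringEncoding];
}

// Usage:
// NSLog(@"%@", StringFromCodePoint(0x0001F60A));  // 😊
// NSLog(@"%@", StringFromCodePoint(0x095F));      // य़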

Way to detect character that takes up more than one index spot in an NSString?

I'm wondering, is there a way to detect a character that takes up more than one index spot in an NSString (like an emoji)? I'm trying to implement a custom text view, and when the user presses delete I need to know whether I should delete only the previous index spot or more.
NSString actually uses UTF-16, so it is awkward to work with characters that occupy two or more UTF-16 code units (unichar). But you can use rangeOfComposedCharacterSequenceAtIndex: to get the range and then delete.
First, find the index of the last code unit in the string:
NSUInteger lastCharIndex = [str length] - 1;
Then get the range of the last composed character:
NSRange lastCharRange = [str rangeOfComposedCharacterSequenceAtIndex:lastCharIndex];
Then delete using that range (if the character consists of two UTF-16 code units, both are removed):
NSString *deletedLastCharString = [str substringToIndex:lastCharRange.location];
You can use this method with any kind of character, however many unichars it occupies.
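Folded into a helper for the delete-key case described above, a minimal sketch, assuming text holds the current contents of the custom text view:
static NSString *StringByDeletingLastCharacter(NSString *text) {
    if ([text length] == 0) return text;                       // nothing to delete
    NSRange last = [text rangeOfComposedCharacterSequenceAtIndex:[text length] - 1];
    return [text substringToIndex:last.location];              // drops 1 or more unichars
}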
For one, you could transform the string to a sequence of bytes using [myString UTF8String] and then check whether a byte has its high bit set. If it does, the byte is part of a multi-byte UTF-8 sequence, and you can then check how many bytes make up that character. Details about UTF-8 can be found on Wikipedia - UTF8. Here is a simple example:
NSString *string = @"ČTest";
const char *str = [string UTF8String];
NSMutableString *ASCIIStr = [NSMutableString string];
for (int i = 0; i < strlen(str); ++i)
    if (!(str[i] & 128))
        [ASCIIStr appendFormat:@"%c", str[i]];
NSLog(@"%@", ASCIIStr); // Should contain only ASCII characters
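For the "how many bytes" part, a sketch of reading the sequence length from the lead byte, assuming the input is valid UTF-8:
static int UTF8SequenceLength(unsigned char leadByte) {
    if ((leadByte & 0x80) == 0x00) return 1;   // 0xxxxxxx: plain ASCII
    if ((leadByte & 0xE0) == 0xC0) return 2;   // 110xxxxx
    if ((leadByte & 0xF0) == 0xE0) return 3;   // 1110xxxx
    if ((leadByte & 0xF8) == 0xF0) return 4;   // 11110xxx
    return -1;                                 // continuation or invalid lead byte
}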

NSString's method 'getCString' returns false for Arabic text

I am generating a QR code and everything works fine if the text is English only. When I want to generate a QR code with some Arabic text, it fails at NSString's getCString:maxLength:encoding: method.
Suppose I have two strings:
NSString *englishText = @"Some text English";
NSString *englishArabicMixText = @"Some text بالعربي";
char strEng [[englishText length] + 1];
char strArb [[englishArabicMixText length] + 1];
1- [englishText getCString:strEng maxLength:[englishText length] + 1 encoding:NSUTF8StringEncoding];
2- [englishArabicMixText getCString:strArb maxLength:[englishArabicMixText length] + 1 encoding:NSUTF8StringEncoding];
In case 1 getCString returns true and the QR code is generated; in case 2 it returns false and fails to generate the code.
What should I do so that case 2 also returns true? Thank you.
length returns the number of UTF-16 code units, not the number of bytes. You have to use lengthOfBytesUsingEncoding:, which returns the number of bytes required to store the receiver in a given encoding.
NSUInteger arbLength = [englishArabicMixText lengthOfBytesUsingEncoding:NSUTF8StringEncoding] + 1;
char strArb [arbLength];
[englishArabicMixText getCString:strArb maxLength:arbLength encoding:NSUTF8StringEncoding];
Case 2 is returning false for one of two possible reasons:
1) the string cannot be converted with the specified encoding;
2) the buffer to hold the encoded string is too small.
I'd guess (or at least I suggest you start investigating there) that the problem is nr. 2.
Because you're converting to UTF-8, a single unencoded character may result in more than one encoded byte. An 'A' is a single byte with value 65, but an Arabic character or some kind of symbol may require more bytes.
You are assuming your destination buffer needs as many bytes as the NSString has characters.
So you should do something like this:
NSUInteger size = [englishArabicMixText lengthOfBytesUsingEncoding:NSUTF8StringEncoding];
if (size > 0)
{
    size++;
    strArb = malloc(size); // NOTE: you should allocate space for your string at runtime!!
    [englishArabicMixText getCString:strArb maxLength:size encoding:NSUTF8StringEncoding];
}
You should do the same for the plain English string too.
And I'd recommend allocating the space for the C string dynamically at runtime with malloc and then freeing it when you no longer need it.
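A compact sketch of that malloc/free pattern, assuming UTF-8 is the encoding the QR code generator expects (the generator call itself is left as a placeholder):
#import <Foundation/Foundation.h>
#include <stdlib.h>

NSString *text = @"Some text بالعربي";
NSUInteger size = [text lengthOfBytesUsingEncoding:NSUTF8StringEncoding] + 1; // +1 for the '\0'
char *buffer = malloc(size);
if (buffer != NULL &&
    [text getCString:buffer maxLength:size encoding:NSUTF8StringEncoding]) {
    // ... hand buffer to the QR code generator here ...
}
free(buffer); // free(NULL) is a no-op, so this is safe in every branch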
