I am allowing the user to input some data into the TextField. The user inputs Š1234D into the TextField.
The code I have looks like this:
NSString *string = textField.text;
for (int nCtr = 0; nCtr < [string length]; nCtr++) {
    const char chars = [string characterAtIndex:nCtr];
    int isAlpha = isalpha(chars);
}
The string output looks like this: Š1234D
Then I printed the first chars value, and it looks like '`' instead of 'Š'. Why is this so? I would like to allow special characters in my code as well.
Any suggestion would be welcome as well. Need some guidance. Thanks
You are truncating the character value, as [NSString characterAtIndex:] returns unichar (16-bit) and not char (8-bit). Try:
unichar chars = [string characterAtIndex:nCtr];
UPDATE: Also note that you shouldn't be using isalpha() to test for letters, as that is restricted to Latin character sets and you need something that can cope with non-Latin characters. Use this code instead:
NSCharacterSet *letterSet = [NSCharacterSet letterCharacterSet];
NSString *string = textField.text;
for (NSUInteger nCtr = 0; nCtr < [string length]; nCtr++)
{
    const unichar c = [string characterAtIndex:nCtr];
    BOOL isAlpha = [letterSet characterIsMember:c];
    ...
}
characterAtIndex: returns a unichar (2-byte Unicode character), not char (1-byte ASCII character). By casting it to char, you are getting only one of the two bytes.
You should turn on your compiler warnings. I believe "Suspicious implicit conversions" should do the trick.
On a separate note, you can't use isalpha(char) with a unichar. Use [[NSCharacterSet letterCharacterSet] characterIsMember:chars] instead.
I need to take a substring of a char* up to some length and convert it to an NSString. The inputs are char *val and the substring length.
I tried
NSString *tempString = [NSString stringWithCString:val encoding:NSASCIIStringEncoding];
NSRange range = NSMakeRange (0, length);
NSString *finalValue = [tempString substringWithRange: range];
This works, but not for languages with other special characters, like Chinese.
If I convert to UTF-8 encoding, then the substring length will mismatch.
Is there any other way to take the substring of the char* and then convert it with UTF-8 encoding?
You have to use the encoding that the string is actually encoded in.
In your case, you tell it to interpret the string as an ASCII string. ASCII does not have Chinese characters, so this cannot work with Chinese characters: they simply are not there.
Likely you have a UTF-8 encoded string. But simply switching to UTF-8 does not help. NSString on OS X/iOS stores 16-bit UTF-16 code units, while Unicode code points go beyond U+FFFF, so characters outside the Basic Multilingual Plane (including some Chinese characters) need more than one code unit. This has some effects; for example, -length returns the number of code units, not the number of Chinese characters. However, with -rangeOfComposedCharacterSequencesForRange: you can adjust the range.
For example 𠀖 (CJK Unified Ideograph U+20016):
NSString *str = @"𠀖"; // One Chinese character
NSLog(@"%lu", (unsigned long)[str length]); // Prints "2": two UTF-16 code units
NSRange range = {0, 1}; // Range for the "first" character
NSLog(@"%lu %lu", (unsigned long)range.location, (unsigned long)range.length); // 0 1
range = [str rangeOfComposedCharacterSequencesForRange:range];
NSLog(@"%lu %lu", (unsigned long)range.location, (unsigned long)range.length); // 0 2
You can get a better answer, if you add information about the encoding of the string coming in and the required encoding for putting out.
Strings are not "UTF-8 strings" or the like. Strings are strings. Their storage, their representation in computer memory, has an encoding, but the strings themselves do not.
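To illustrate that point, here is a minimal sketch: the same NSString yields differently sized byte buffers depending on which encoding you ask for, while the string itself stays the same.
NSString *s = @"手";                                                      // one Chinese character
NSData *utf8  = [s dataUsingEncoding:NSUTF8StringEncoding];               // 3 bytes
NSData *utf16 = [s dataUsingEncoding:NSUTF16LittleEndianStringEncoding];  // 2 bytes
NSLog(@"UTF-8: %lu bytes, UTF-16LE: %lu bytes",
      (unsigned long)utf8.length, (unsigned long)utf16.length);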
I found the solution to my question:
char subString[length+1];
strncpy(subString, val, length);
subString[length] = '\0'; // place the null terminator
NSString *finalString = [NSString stringWithCString: subString encoding:NSUTF8StringEncoding];
This does both the char* substring and the UTF-8 encoding.
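Note that strncpy can cut a multi-byte UTF-8 sequence in half if length ever lands in the middle of a character, in which case stringWithCString: will typically return nil. If length is meant as a count of characters rather than bytes (an assumption, since the question treats it as bytes), a sketch that decodes first and then clamps the range, using the same val and length as above, would be:
// Sketch: decode the whole buffer, then take the first `length` UTF-16 units,
// widening the range so it never splits a composed character.
NSString *whole = [NSString stringWithCString:val encoding:NSUTF8StringEncoding];
NSRange range = NSMakeRange(0, MIN((NSUInteger)length, [whole length]));
range = [whole rangeOfComposedCharacterSequencesForRange:range];
NSString *finalValue = [whole substringWithRange:range];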
Now I have a range of Unicode code points that I want to show in a UILabel. I can show them if I hardcode them, but that is too slow to do by hand, so I want to substitute a variable, then change the variable and get the relevant character.
For example, I know the code point is U+095F and I want to show the range U+095F to U+096F in the UILabel. I can do that hardcoded like
NSString *str = [NSString stringWithFormat:@"\u095f"];
but I want to do that like
NSInteger hex = 0x095f;
[NSString stringWithFormat:@"\u%ld", (long)hex];
I can change the hex automatically, just like using @"%ld", (long)hex, so does anybody know how to implement that?
You can initialize the string with a buffer containing the bytes of the hex value (you simply provide its pointer). The important thing to notice is that you provide the character encoding to be applied; in particular, pay attention to the byte order.
Here's an example:
UInt32 hex = 0x095f;
NSString *unicodeString = [[NSString alloc] initWithBytes:&hex length:sizeof(hex) encoding:NSUTF32LittleEndianStringEncoding];
Note that solutions like using the %C format are fine as long as you use them for 16-bit unicode characters; 32-bit unicode characters like emojis (for example: 0x1f601, 0x1f41a) will not work using simple formatting.
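As a sketch of that caveat, the same initWithBytes: approach also handles a code point above U+FFFF, which %C cannot print (the emoji value here is just an example):
UInt32 codePoint = 0x1f601;   // a grinning-face emoji, outside the 16-bit range
NSString *emoji = [[NSString alloc] initWithBytes:&codePoint
                                           length:sizeof(codePoint)
                                         encoding:NSUTF32LittleEndianStringEncoding];
NSLog(@"%@ has length %lu", emoji, (unsigned long)[emoji length]);   // length 2: a surrogate pair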
You would have to use
[NSString stringWithFormat:@"%C", (unichar)hex];
or directly declare the unichar (unsigned short) as
unichar uni = 0x095f;
[NSString stringWithFormat:@"%C", uni];
A useful resource might be the String Format Specifiers, which lists %C as
16-bit Unicode character (unichar), printed by NSLog() as an ASCII character, or, if not an ASCII character, in the octal format \ddd or the Unicode hexadecimal format \udddd, where d is a digit.
Like this:
unichar charCode = 0x095f;
NSString *s = [NSString stringWithFormat:@"%C", charCode];
NSLog(@"String = %@", s); // Output: String = य़
I'm wondering, is there a way to detect a character that takes up more than one index spot in an NSString (like an emoji)? I'm trying to implement a custom text view, and when the user pushes delete, I need to know whether I should delete only the previous index spot or more than one.
Actually, NSString uses UTF-16, so it is quite difficult to work with characters that take two UTF-16 code units (unichars) or more. But you can use rangeOfComposedCharacterSequenceAtIndex: to get the range and then delete.
First, find the index of the last unichar in the string:
NSUInteger lastCharIndex = [str length] - 1;
Then get the range of the last character:
NSRange lastCharRange = [str rangeOfComposedCharacterSequenceAtIndex: lastCharIndex];
Then delete using that range (if the character consists of two UTF-16 code units, both are removed):
NSString *deletedLastCharString = [str substringToIndex:lastCharRange.location];
You can use this method with any type of character, no matter how many unichars it takes.
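Putting those steps together, a minimal sketch (the string literal here is just an assumed example):
NSString *str = @"Hi👍";   // the emoji takes two unichars (a surrogate pair)
NSUInteger lastCharIndex = [str length] - 1;
NSRange lastCharRange = [str rangeOfComposedCharacterSequenceAtIndex:lastCharIndex];
NSString *deleted = [str substringToIndex:lastCharRange.location];
NSLog(@"%@", deleted);     // "Hi": the whole emoji is removed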
For one, you could transform the string to a sequence of bytes using [myString UTF8String] and then check whether each byte has its high bit set. If it is set, the byte is part of a multi-byte UTF-8 sequence, and you can then check how many bytes that character takes. Details about UTF-8 can be found on Wikipedia - UTF-8. Here is a simple example:
NSString *string = @"ČTest";
const char *str = [string UTF8String];
NSMutableString *ASCIIStr = [NSMutableString string];
for (int i = 0; i < strlen(str); ++i) {
    if (!(str[i] & 128)) {
        [ASCIIStr appendFormat:@"%c", str[i]];
    }
}
NSLog(@"%@", ASCIIStr); // Should contain only ASCII characters
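If you also need to know how many bytes a given sequence occupies, the leading byte tells you; a rough sketch of the check described above, using the bit patterns from the UTF-8 spec:
// Returns how many bytes the UTF-8 sequence starting with `lead` occupies.
static int UTF8SequenceLength(unsigned char lead) {
    if ((lead & 0x80) == 0x00) return 1;   // 0xxxxxxx: plain ASCII
    if ((lead & 0xE0) == 0xC0) return 2;   // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;   // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;   // 11110xxx
    return 1;                              // continuation or invalid byte
}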
I'm struggling to convert Chinese words/characters to ASCII or hexadecimal, and all the values I've got up until now are not what I was supposed to get.
An example of the conversion is the word 手, whose hex should be 1534b.
The methods I've followed so far are below, and I got a variety of results, but not the one I was looking for.
I'd really appreciate it if you could help me out with this issue,
Thanks,
Mike
- (NSString *)stringToHex:(NSString *)str {
    NSUInteger len = [str length];
    unichar *chars = malloc(len * sizeof(unichar));
    [str getCharacters:chars];
    NSMutableString *hexString = [[NSMutableString alloc] init];
    for (NSUInteger i = 0; i < len; i++) {
        [hexString appendFormat:@"%02x", chars[i]]; // EDITED PER COMMENT BELOW
    }
    free(chars);
    return hexString;
}
and
const char *cString = [@"手" cStringUsingEncoding:NSASCIIStringEncoding];
Below is similar code in Java for Android; maybe it helps:
public boolean sendText(INotifiableManager manager, String text) {
final int codeOffset = 0xf100;
for (char c : text.toCharArray()) {
int code = (int)c+codeOffset;
if (! mConnection.getBoolean(manager, "SendKey", Integer.toString(code))) {
}
Your Java code is just doing this:
Take each 16-bit character of the string and add 0xf100 to it.
If you do the same thing in your above Objective-C code you will get the result you want.
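A minimal sketch of that suggestion in Objective-C, reusing the loop from your stringToHex: method (the 0xf100 offset is taken from your Java snippet):
NSString *str = @"手";                                        // U+624B
for (NSUInteger i = 0; i < [str length]; i++) {
    unsigned int code = [str characterAtIndex:i] + 0xf100;    // same offset as the Java code
    NSLog(@"%x", code);                                       // prints 1534b for 手
}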
I have seen questions on Stack Overflow that convert a unichar to an NSString, but now I would like to do the reverse.
How do i do it?
Need some guidance.. Thanks
For example, I have an array of strings: @[@"o", @"p", @"q"];
The elements are strings. How do I convert them back to unichars?
The following will work as long as the first character isn't actually composed of two UTF-16 code units (in other words, as long as the character doesn't have a Unicode value greater than U+FFFF):
unichar ch = [someString characterAtIndex:0];
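For the array in the question, a small sketch that applies this to each element:
NSArray *strings = @[@"o", @"p", @"q"];
for (NSString *s in strings) {
    unichar ch = [s characterAtIndex:0];   // safe here: each element is a single BMP character
    NSLog(@"%C is U+%04X", ch, ch);
}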
You could convert it to a buffer in NSData:
if ([string canBeConvertedToEncoding:NSUnicodeStringEncoding]) {
    NSData *data = [string dataUsingEncoding:NSUnicodeStringEncoding];
    const unichar *ptr = (const unichar *)data.bytes;
    ...
}
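One thing to watch for (an observation, not part of the answer above): with NSUnicodeStringEncoding the data may start with a byte-order mark, so asking for an explicit byte order can make the buffer easier to index. A sketch:
NSData *data = [string dataUsingEncoding:NSUTF16LittleEndianStringEncoding];   // no BOM
const unichar *ptr = (const unichar *)data.bytes;
NSUInteger count = data.length / sizeof(unichar);
for (NSUInteger i = 0; i < count; i++) {
    NSLog(@"U+%04X", ptr[i]);
}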