I have a hex string like "000000000100" and I am using the following logic to convert it to its ASCII equivalent. The output I receive is only one byte (\x01), but I want all six bytes: \x00\x00\x00\x00\x01\x00.
-(NSString *)decode
{
    NSString *string = @"000000000100";
    NSMutableString *newString = [[NSMutableString alloc] init];
    int i = 0;
    while (i < [string length])
    {
        NSString *hexChar = [string substringWithRange:NSMakeRange(i, 2)];
        int value = 0;
        sscanf([hexChar cStringUsingEncoding:NSASCIIStringEncoding], "%x", &value);
        [newString appendFormat:@"%c", (char)value];
        i += 2;
    }
    return newString;
}
How can I do that?
Let's first address your bug directly. In your code you attempt to add the next byte to your string with:
[newString appendFormat:@"%c", (char)value];
Your problem is that %c produces nothing if the character is a null, so you are appending an empty string, and as you found, you end up with a string containing a single byte.
You can fix your code by testing for the null and appending a string containing a single null:
if (value == 0)
    [newString appendString:@"\0"]; // append a single null
else
    [newString appendFormat:@"%c", (char)value];
Second, is this the way to do this?
Other answers have shown you other algorithms; they may be more efficient than yours because they convert to a C string once rather than repeatedly extracting substrings and converting each one individually.
If, and only if, performance is a real issue for you, you might wish to consider such C-based solutions. You clearly know how to use sscanf, but in a case as simple as this you might want to look at digittoint and do the conversion of two hex digits to an integer yourself (value of the first * 16 + value of the second).
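For illustration, here is a rough sketch of that digittoint approach (the helper name and structure are my own, not from the question):
#import <Foundation/Foundation.h>
#include <ctype.h>   // digittoint is a BSD function available on macOS/iOS
#include <string.h>

// Hypothetical helper: decode pairs of hex digits into raw byte values.
static NSString *decodeHexWithDigittoint(NSString *hex)
{
    const char *c = [hex cStringUsingEncoding:NSASCIIStringEncoding];
    size_t len = strlen(c);
    NSMutableString *result = [NSMutableString string];
    for (size_t i = 0; i + 1 < len; i += 2)
    {
        int value = digittoint(c[i]) * 16 + digittoint(c[i + 1]);
        if (value == 0)
            [result appendString:@"\0"];              // same null workaround as above
        else
            [result appendFormat:@"%c", (char)value];
    }
    return result;
}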
Conversely, if you'd like to avoid C and sscanf, look at NSScanner and scanHexInt:/scanHexLongLong:. If your strings are never longer than 16 hex digits you can convert the whole string in one go and then produce an NSString from the bytes of the resulting unsigned 64-bit integer.
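A sketch of that NSScanner route, assuming the input never exceeds 16 hex digits (again, the naming is mine):
// Hypothetical sketch: convert the whole hex string in one scan.
static NSString *decodeHexWithScanner(NSString *hex)
{
    unsigned long long value = 0;
    [[NSScanner scannerWithString:hex] scanHexLongLong:&value];

    NSUInteger byteCount = [hex length] / 2;      // one byte per pair of digits
    NSMutableString *result = [NSMutableString string];
    for (NSUInteger i = 0; i < byteCount; i++)
    {
        unsigned char byte = (value >> (8 * (byteCount - 1 - i))) & 0xFF;
        if (byte == 0)
            [result appendString:@"\0"];
        else
            [result appendFormat:@"%c", byte];
    }
    return result;
}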
HTH
I'm using an old Objective-C routine (let's call it oldObjectiveCFunction) that parses a String by analyzing each char. After analyzing the chars, it splits that String into several Strings and returns them in an array called *functions. This is a heavily reduced sample of how that old function parses the String:
NSMutableArray *functions = [NSMutableArray new];
NSMutableArray *components = [NSMutableArray new];
NSMutableString *sb = [NSMutableString new];
char c;
int sourceLen = (int)source.length;
int index = 0;
while (index < sourceLen) {
    c = [source characterAtIndex:index];
    // here do some random work analyzing the char
    [sb appendString:[NSString stringWithFormat:@"%c", c]];
    if (some condition) {
        [components addObject:(NSString *)sb];
        sb = [NSMutableString new];
        [functions addObject:[components copy]];
    }
    index++;
}
Later, I'm reading each String of *functions with this Swift code:
let functions = oldObjectiveCFunction(string) as? [[String]]
functions?.forEach({ (function) in
    var functionCopy = function.map { $0 }
    for index in 0..<functionCopy.count {
        let string = functionCopy[index]
    }
})
The problem is that it works perfectly with ordinary strings, but if the String contains Russian text like this:
РАЦИОН
the output, the content of my let string variable, is this:
\u{10}&\u{18}\u{1e}\u{1d}
How can I get the same Russian string instead of that?
I tried doing this:
let string2 = String(describing: string?.cString(using: String.Encoding.utf8))
but it returns an even stranger result:
"Optional([32, 16, 38, 24, 30, 29, 0])"
Analysis. Sorry, I don't speak Swift or Objective-C, so the following example is given in Python; however, the 4th and 5th columns (the Unicode value reduced to 8 bits) reproduce the weird numbers in your question.
for ch in 'РАЦИОН':
    print(ch,                               # character itself
          ord(ch),                          # character unicode in decimal
          '{:04x}'.format(ord(ch)),         # character unicode in hexadecimal
          ord(ch) & 0xFF,                   # unicode reduced to 8-bit decimal
          '{:02x}'.format(ord(ch) & 0xFF))  # unicode reduced to 8-bit hexadecimal
Р 1056 0420 32 20
А 1040 0410 16 10
Ц 1062 0426 38 26
И 1048 0418 24 18
О 1054 041e 30 1e
Н 1053 041d 29 1d
Solution. Hence, you need to fix every place in your code that reduces a 16-bit value to 8 bits:
First, declare unichar c; instead of char c;, and use [sb appendString:[NSString stringWithFormat:@"%C", c]]; when appending. Note the difference (see the sketch after this note):
the Latin capital letter C in the %C specifier means a 16-bit UTF-16 code unit (unichar), whereas
the Latin small letter c in the %c specifier means an 8-bit unsigned character (unsigned char).
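A minimal sketch of the corrected loop, assuming the rest of your routine stays as posted:
unichar c;                                   // 16-bit UTF-16 code unit, not char
int sourceLen = (int)source.length;
int index = 0;
while (index < sourceLen) {
    c = [source characterAtIndex:index];
    // ... your analysis of the character ...
    [sb appendString:[NSString stringWithFormat:@"%C", c]];  // capital C
    index++;
}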
Resources. My answer is based on answers to the following questions at SO:
What are the supported Swift String format specifiers?
objective-c - difference between char and unichar?
Your last result is not strange. The optional comes from string?, and the cString() function returns an array of CChar (Int8).
I think the problem comes from here - but I'm not sure because the whole thing looks confusing:
[sb appendString:[NSString stringWithFormat:@"%c", c]];
Have you tried:
[sb appendString: [NSString stringWithCString:c encoding:NSUTF8StringEncoding]];
Instead of stringWithFormat?
(The %C instead of %c solution proposed by your commenters looks like a good idea too.) Oops, I just saw that you already tried it without success.
I need to take a substring of a char *val, up to some length, and convert it to an NSString.
I tried
NSString *tempString = [NSString stringWithCString:val encoding:NSASCIIStringEncoding];
NSRange range = NSMakeRange(0, length);
NSString *finalValue = [tempString substringWithRange:range];
This works, but not for languages with special characters, like Chinese.
If I convert using UTF-8 encoding, the substring length doesn't match.
Is there any other way to take a substring of the char* and then convert it with UTF-8 encoding?
You have to use the encoding the string is actually encoded in.
In your case, you tell it to interpret the string as an ASCII string. ASCII has no Chinese characters, so this cannot work with Chinese text: those characters simply do not exist in that encoding.
Likely you have a UTF-8 encoded string. But simply switching to UTF-8 does not help. NSString on OS X/iOS stores 16-bit Unicode (UTF-16), while code points outside the Basic Multilingual Plane do not fit in a single 16-bit unit, so some Chinese characters need multiple code units. This has side effects: for example, -length returns the number of code units, not the number of Chinese characters. However, with -rangeOfComposedCharacterSequencesForRange: you can adjust the range.
For example 𠀖 (CJK Unified Ideograph U+20016):
NSString *str = @"𠀖"; // one Chinese whatever
NSLog(@"%ld", [str length]); // these are "2" characters
NSRange range = {0, 1}; // range for the "first" character
NSLog(@"%ld %ld", range.location, range.length); // 0 1
range = [str rangeOfComposedCharacterSequencesForRange:range];
NSLog(@"%ld %ld", range.location, range.length); // 0 2
You can get a better answer if you add information about the encoding of the incoming string and the required encoding of the output.
Strings are not "UTF-8 strings" or the like. Strings are strings. Their storage, their representation in computer memory, has an encoding, but the strings themselves do not.
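As a small illustration of that point (not tied to the question's code): the very same NSString yields different bytes depending on which encoding you ask for.
NSString *s = @"漢";                                    // one Chinese character
NSData *utf8  = [s dataUsingEncoding:NSUTF8StringEncoding];
NSData *utf16 = [s dataUsingEncoding:NSUTF16StringEncoding];
NSLog(@"UTF-8: %lu bytes, UTF-16: %lu bytes",
      (unsigned long)utf8.length, (unsigned long)utf16.length);
// The string is the same; only its encoded representations differ.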
I found the solution to my question:
char subString[length+1];
strncpy(subString, val, length);
subString[length] = '\0'; // place the null terminator
NSString *finalString = [NSString stringWithCString: subString encoding:NSUTF8StringEncoding];
This way I do both the char* substring and the UTF-8 conversion.
I'm wondering: is there a way to detect a character that takes up more than one index position in an NSString (like an emoji)? I'm trying to implement a custom text view, and when the user presses delete, I need to know whether I should delete only the previous index position or more.
Actually, NSString uses UTF-16, so it is awkward to work with characters that take two UTF-16 code units (unichar) or more. But you can use rangeOfComposedCharacterSequenceAtIndex: to get the range and then delete.
First, find the index of the last code unit in the string:
NSUInteger lastCharIndex = [str length] - 1;
Then get the range of the composed character at that index:
NSRange lastCharRange = [str rangeOfComposedCharacterSequenceAtIndex:lastCharIndex];
Then delete using that range (if the character consists of two UTF-16 code units, both are removed):
NSString *deletedLastCharString = [str substringToIndex:lastCharRange.location];
You can use this method with any kind of character, no matter how many unichars it takes; a quick example follows.
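A minimal usage sketch, assuming a string that ends in an emoji:
NSString *str = @"Hi👍";                        // the emoji occupies two unichars
NSRange lastCharRange =
    [str rangeOfComposedCharacterSequenceAtIndex:[str length] - 1];
NSString *deletedLastCharString = [str substringToIndex:lastCharRange.location];
NSLog(@"%@", deletedLastCharString);            // prints "Hi": both halves removed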
For one, you could transform the string to a sequence of bytes using [myString UTF8String] and then check whether each byte has its top bit set. If it is set, the byte belongs to a multi-byte UTF-8 sequence, and you can then check how many bytes make up that character. Details about UTF-8 can be found on Wikipedia - UTF-8. Here is a simple example:
NSString *string = @"ČTest";
const char *str = [string UTF8String];
NSMutableString *ASCIIStr = [NSMutableString string];
for (int i = 0; i < strlen(str); ++i)
    if (!(str[i] & 128))
        [ASCIIStr appendFormat:@"%c", str[i]];
NSLog(@"%@", ASCIIStr); // should contain only ASCII characters
I have seen questions on Stack Overflow that convert a unichar to an NSString, but now I would like to do the reverse.
How do I do it?
I need some guidance. Thanks.
For example, I have an array of strings: @[@"o", @"p", @"q"];
These are strings inside it. How do I convert them back to unichar?
The following will work as long as the first character isn't actually a composed character sequence (in other words, as long as the character doesn't have a Unicode value greater than U+FFFF):
unichar ch = [someString characterAtIndex:0];
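If you need to guard against that case, one possible check (a sketch using the composed-character API mentioned elsewhere on this page):
NSRange first = [someString rangeOfComposedCharacterSequenceAtIndex:0];
if (first.length == 1) {
    unichar ch = [someString characterAtIndex:0];  // safe: a single UTF-16 unit
    // ... use ch ...
} else {
    // The first character needs more than one unichar (e.g. an emoji),
    // so a single unichar cannot represent it.
}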
You could convert it to a buffer in NSData:
if ([string canBeConvertedToEncoding:NSUnicodeStringEncoding]) {
    NSData *data = [string dataUsingEncoding:NSUnicodeStringEncoding];
    const unichar *const ptr = (const unichar *)data.bytes;
    ...
}
I am generating a QR code, and everything works fine if the text is only in English. When I want to generate a QR code with some Arabic text, it fails at NSString's getCString:maxLength:encoding: method.
Suppose I have two strings:
NSString *englishText = @"Some text English";
NSString *englishArabicMixText = @"Some text بالعربي";
char strEng [[englishText length] + 1];
char strArb [[englishArabicMixText length] + 1];
// Case 1:
[englishText getCString:strEng maxLength:[englishText length] + 1 encoding:NSUTF8StringEncoding];
// Case 2:
[englishArabicMixText getCString:strArb maxLength:[englishArabicMixText length] + 1 encoding:NSUTF8StringEncoding];
In case 1, getCString returns true and the QR code is generated; in case 2, it returns false and fails to generate the code.
What should I do so that case 2 also returns true? Thank you.
length returns the number of UTF-16 code units, not bytes. You have to use lengthOfBytesUsingEncoding:, which returns the number of bytes required to store the receiver in a given encoding.
NSUInteger arbLength = [englishArabicMixText lengthOfBytesUsingEncoding:NSUTF8StringEncoding] + 1;
char strArb [arbLength];
[englishArabicMixText getCString:strArb maxLength:arbLength encoding:NSUTF8StringEncoding];
Case 2 returns false for one of two possible reasons:
1) the string cannot be converted with the specified encoding, or
2) the buffer to hold the encoded string is too small.
I'd guess (or at least I suggest you start by investigating) that the problem is number 2.
When you convert to UTF-8, a single unencoded character may produce more than one encoded byte. An 'A' is a single byte with value 65, but an Arabic character or some other symbol may require more bytes.
You are assuming your destination buffer needs the same number of bytes as your NSString has characters.
So you should do something like this:
NSUInteger size = [englishArabicMixText lengthOfBytesUsingEncoding:NSUTF8StringEncoding];
if (size > 0)
{
    size++;
    char *strArb = malloc(size); // NOTE: you should allocate space for your string at runtime!!
    [englishArabicMixText getCString:strArb maxLength:size encoding:NSUTF8StringEncoding];
}
You should do the same for the plain English string too.
And I'd recommend allocating the space for the C string dynamically at runtime with malloc and then freeing it when you no longer need it, as sketched below.
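A rough sketch of that allocate / convert / free flow (variable names are mine):
#include <stdlib.h> // malloc, free

NSUInteger size = [englishArabicMixText lengthOfBytesUsingEncoding:NSUTF8StringEncoding];
if (size > 0)
{
    char *buffer = malloc(size + 1); // +1 for the terminating NUL
    if ([englishArabicMixText getCString:buffer
                               maxLength:size + 1
                                encoding:NSUTF8StringEncoding])
    {
        // ... feed `buffer` to the QR code generator ...
    }
    free(buffer); // release the C string once you are done with it
}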