How to convert string to utf8 encoding? - ios

I am reading my app's Documents directory like this:
NSArray *pathss = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentsDirectorys = [pathss objectAtIndex:0];
NSError *error = nil;
// contentsOfDirectoryAtPath:error: returns an immutable NSArray of file names
NSArray *directoryContents = [[NSFileManager defaultManager]
    contentsOfDirectoryAtPath:documentsDirectorys error:&error];
The output I get:
"Forms_formatted.pdf",
"fund con u\U0308u\U0308.pdf",
"hackermonthly-issue.pdf",
These are the file names. My question: how can I convert the name "fund con u\U0308u\U0308.pdf" to the correct format? Thanks in advance.

Maybe this will help:
NSString *documentsDirectorys = [pathss objectAtIndex:0];
const char *documentsDirectoryCstring = [documentsDirectorys cStringUsingEncoding:NSASCIIStringEncoding];
EDIT:
const char *documentsDirectoryCstring = [documentsDirectorys cStringUsingEncoding:NSUTF8StringEncoding];

Convert the NSString to NSData with UTF-16 encoding, convert back to a NSString with UTF-8 encoding.
It is probably UTF-16 big endian with no BOM.
The sequence u\U0308 is not ASCII: \U0308 (U+0308) is a Combining Diacritical Mark ̈, also known as double dot above, umlaut, Greek dialytika and double derivative. It places the umlaut above the preceding u character, giving ü.

You can try this. NSLog uses the description method of NSArray to print an NSArray object, and that method handles Unicode characters differently from the description of the objects it contains, such as NSString objects. Alternatively, you can loop through the array and NSLog each item.
NSLog(@"%@", [NSString stringWithCString:[[array description] cStringUsingEncoding:NSASCIIStringEncoding] encoding:NSNonLossyASCIIStringEncoding]);
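If the goal is a precomposed character (a single ü rather than u followed by U+0308), one option is NSString's precomposedStringWithCanonicalMapping, which applies Unicode normalization form C. This is only a sketch: the helper name PrecomposedFileNames is made up for illustration, and it assumes the array holds the file names returned by contentsOfDirectoryAtPath:error:.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original thread): returns copies of the
// file names in Unicode normalization form C, so that "u" followed by U+0308
// becomes the single character U+00FC.
static NSArray<NSString *> *PrecomposedFileNames(NSArray<NSString *> *names) {
    NSMutableArray<NSString *> *result = [NSMutableArray arrayWithCapacity:names.count];
    for (NSString *name in names) {
        [result addObject:name.precomposedStringWithCanonicalMapping];
    }
    return result;
}
```

Logging each element on its own with NSLog(@"%@", name) prints the characters themselves, whether or not the names have been normalized; the escapes only appear when the whole array is logged.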

Related

NSString stringWithCString not showing special characters?

I'm using
[NSString stringWithFormat:@"%C",(unichar)decimalValueX];
but I have to call it thousands of times and it's simply too slow.
As an alternative I tried this:
sprintf (cString, "%C", (unichar)decimalValueX);
[NSString stringWithCString:cString encoding:NSUTF16StringEncoding];
but no characters are translated correctly.
If I try UTF8 instead of 16:
sprintf (cString, "%C", (unichar)decimalValueX);
[NSString stringWithCString:cString encoding:NSUTF8StringEncoding];
I get alphanumeric, but I don't get foreign characters or other special characters.
Can anyone explain what's going on? Or how to make stringWithFormat faster?
Thanks!
It seems that the %C format does not work with sprintf and related functions and non-ASCII characters. But there is a simpler method:
stringWithCharacters:length:
creates an NSString directly from a unichar array (UTF-16 code units).
For a single unichar this would be just
NSString *string = [NSString stringWithCharacters:&decimalValueX length:1];
Example:
unichar decimalValueX = 8364; // The Euro character
NSString *string = [NSString stringWithCharacters:&decimalValueX length:1];
NSLog(@"%@", string); // €
Example for multiple UTF-16 code units:
unichar utf16[] = { 945, 946, 947 };
NSString *string3 = [NSString stringWithCharacters:utf16 length:3];
NSLog(@"%@", string3); // αβγ
For characters outside of the "basic multilingual plane" (i.e. characters > U+FFFF) you would have to use 2 UTF-16 code units per character (a surrogate pair).
Or use a different API like
uint32_t utf32[] = { 128123, 128121 };
NSString *string4 = [[NSString alloc] initWithBytes:utf32 length:2*4 encoding:NSUTF32LittleEndianStringEncoding];
NSLog(@"%@", string4); // 👻👹
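The surrogate-pair route mentioned above can be sketched as follows. The helper name StringFromCodePoint is hypothetical, and the sketch does no validation: it assumes the input is a valid Unicode scalar value (not in the range U+D800 to U+DFFF and not above U+10FFFF).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): builds an NSString from
// one Unicode code point, splitting it into a UTF-16 surrogate pair when it
// lies above U+FFFF. Assumes the caller passes a valid scalar value.
static NSString *StringFromCodePoint(uint32_t codePoint) {
    if (codePoint <= 0xFFFF) {
        unichar unit = (unichar)codePoint;
        return [NSString stringWithCharacters:&unit length:1];
    }
    uint32_t offset = codePoint - 0x10000;
    unichar pair[2] = {
        (unichar)(0xD800 + (offset >> 10)),   // high (leading) surrogate
        (unichar)(0xDC00 + (offset & 0x3FF))  // low (trailing) surrogate
    };
    return [NSString stringWithCharacters:pair length:2];
}
```

For 128123 (U+1F47B, the ghost used above) this yields the two code units 0xD83D and 0xDC7B, so the resulting string has length 2 even though it displays as one character.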

Sorting in NSHomeDirectory

I have this code to save images in my app
NSString *fileName = [NSString stringWithFormat:@"2013_%d_a_%d",count,indexToInsert];
NSString *pngPath = [NSHomeDirectory() stringByAppendingPathComponent:[@"Documents/" stringByAppendingString:fileName]];
NSData *imageData = UIImagePNGRepresentation(imageToAdd);
[imageData writeToFile:pngPath atomically:YES];
in my log I see this:
"2013_10_a_1",
"2013_1_a_1",
"2013_2_a_1",
"2013_3_a_1",
"2013_4_a_1",
"2013_5_a_1",
"2013_6_a_1",
"2013_7_a_1",
"2013_8_a_1",
"2013_9_a_1"
Why is "2013_10_a_1" at the top? It's in position 0, but I want it at position 9 (there are 10 elements).
The issue here is that the underscore character _ (ascii code 95) is sorted after any number character (ascii codes 48 to 57).
Change the output filename to include leading zero and you won't need to mess with sorting issues:
NSString *fileName = [NSString stringWithFormat:@"2013_%03d_a_%d",count,indexToInsert];
Will output:
"2013_001_a_1",
"2013_002_a_1",
"2013_003_a_1",
"2013_004_a_1",
"2013_005_a_1",
"2013_006_a_1",
"2013_007_a_1",
"2013_008_a_1",
"2013_009_a_1",
"2013_010_a_1"
Your strings contain numbers so you need to do a numeric sort, not a plain string sort. For this, use the compare:options: method on NSString with an option of NSNumericSearch.
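A minimal sketch of that numeric sort, assuming the file names are already in an NSArray of NSString; the helper name NumericallySortedNames is made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: sorts names so that runs of digits compare by numeric
// value ("2013_2_a_1" before "2013_10_a_1") instead of character by character.
static NSArray<NSString *> *NumericallySortedNames(NSArray<NSString *> *names) {
    return [names sortedArrayUsingComparator:^NSComparisonResult(NSString *a, NSString *b) {
        return [a compare:b options:NSNumericSearch];
    }];
}
```

Unlike the zero-padding approach, this leaves the existing file names untouched and only changes the order in which they are listed.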

NSString with Japanese Chars

In my application I have an NSString that gets its contents from the web:
高瀬 - 虎柄の毘沙門天
Now I want to copy this string to a local NSString in my object, so I wrote:
self.metaDataString = [NSString stringWithString:tempMetaDataString];
And now metaDataString contains:
é«ç¬ - èæã®æ¯æ²é天
What could be causing this problem?
I tried this too:
self.metaDataString = [NSString stringWithUTF8String:[tempMetaDataString UTF8String]];
This is how I get tempMetaDataString:
NSMutableString *tempMetaDataString = [NSMutableString stringWithCapacity:0];
// This line runs inside a loop over the bytes array
[tempMetaDataString appendFormat:@"%c", bytes[i]];
And this is the bytes array:
UInt8 bytes[kAQBufSize];
length = CFReadStreamRead(stream, bytes, kAQBufSize);
This line cannot work for multi-byte characters:
[tempMetaDataString appendFormat:@"%c", bytes[i]];
If you have a multi-byte character, this is going to split it up into individual bytes, each appended as a separate single-byte character (as you're seeing).
It's unclear from this code what bytes really is. Is the string of a fixed length, or is the string NUL-terminated? If it's of a fixed length, then you want (assuming this is UTF-8):
// Pass the byte count returned by CFReadStreamRead, not the buffer capacity
self.metaDataString = [[NSString alloc] initWithBytes:bytes
                                               length:length
                                             encoding:NSUTF8StringEncoding];
If this is a NUL-terminated UTF-8 string:
self.metaDataString = [NSString stringWithUTF8String:(const char *)bytes];
If some other encoding (for example NSJapaneseEUCStringEncoding or NSShiftJISStringEncoding):
self.metaDataString = [NSString stringWithCString:(const char *)bytes encoding:theEncoding];
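When the data arrives through several reads, a multi-byte character can straddle two buffers, so it may be safer to collect all the bytes first and decode once at the end. The sketch below assumes the text is UTF-8 and that each read has been wrapped in an NSData; the helper name StringFromUTF8Chunks is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: concatenates the raw chunks and decodes them as UTF-8
// in a single step. Returns nil if the combined bytes are not valid UTF-8.
static NSString *StringFromUTF8Chunks(NSArray<NSData *> *chunks) {
    NSMutableData *buffer = [NSMutableData data];
    for (NSData *chunk in chunks) {
        [buffer appendData:chunk];
    }
    return [[NSString alloc] initWithData:buffer encoding:NSUTF8StringEncoding];
}
```

Decoding each chunk separately could fail whenever a chunk boundary falls inside a character, which is why the bytes are joined before the conversion.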

Parsing and processing Text Strings in iOS

I'm looking for the best programming approach in iOS to manipulate and process text strings. Thanks!
I would like to take a file of strings and manipulate the characters, similar to the following:
NQXB26JT1RKLP9VHarren Daggett B0BMAF00SSQ ME03B98TBAA8D
NBQB25KT1RKLP05Billison Whiner X0AMAF00UWE 8E21B98TBAF8W
...
...
...
Each string would be processed in turn, then the loop would move on to the next string, and so on.
Strip out the name and the following strings:
Take the following 3 string fragments and convert them to another number base. I have the code to process the new result, but I'm unsure how to send these short strings to be processed in series.
QXB26
B0BM
BAA8
Then output the results to a file. The xxx represents the converted numbers.
xxxxxxxxx Harren Daggett xxxxxxxx xxxxxxxx
xxxxxxxxx Billison Whiner xxxxxxxx xxxxxxxx
...
...
...
The end result would be pulling parts of strings out of the first file and creating a new file with the desired result.
There are several ways to accomplish what you are after, but if you want something simple and reasonably easy to debug, you could simply split up each record by the fixed position of each of the fields you have identified (the numbers, the name), then use a simple regular expression replace to condense the name and put it all back together.
For purposes like this I prefer a simple (and even a bit pedestrian) solution that is easy to follow and debug, so this example is not optimised:
NSFileManager *fm = [NSFileManager defaultManager];
NSArray *URLs = [fm URLsForDirectory:NSDocumentDirectory
                           inDomains:NSUserDomainMask];
NSURL *workingdirURL = URLs.lastObject;
NSURL *inputFileURL = [workingdirURL URLByAppendingPathComponent:@"input.txt" isDirectory:NO];
NSURL *outputFileURL = [workingdirURL URLByAppendingPathComponent:@"output.txt" isDirectory:NO];
// For the purpose of this example, just read it all in one chunk
NSError *error;
NSString *stringFromFileAtURL = [[NSString alloc]
    initWithContentsOfURL:inputFileURL
                 encoding:NSUTF8StringEncoding
                    error:&error];
if (!stringFromFileAtURL) {
    // Error, do something more intelligent than just returning
    return;
}
NSArray *records = [stringFromFileAtURL componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
NSMutableArray *newRecords = [NSMutableArray array];
for (NSString *record in records) {
    // Skip blank or short lines (e.g. after a trailing newline);
    // substringWithRange: throws if a range runs past the end of the string
    if (record.length < 69) {
        continue;
    }
    NSString *firstNumberString = [record substringWithRange:NSMakeRange(1, 5)];
    NSString *nameString = [record substringWithRange:NSMakeRange(15, 27)];
    NSString *secondNumberString = [record substringWithRange:NSMakeRange(43, 4)];
    NSString *thirdNumberString = [record substringWithRange:NSMakeRange(65, 4)];
    // Collapse runs of spaces in the name into a single space
    NSString *condensedNameString = [nameString stringByReplacingOccurrencesOfString:@" +"
                                                                          withString:@" "
                                                                             options:NSRegularExpressionSearch
                                                                               range:NSMakeRange(0, nameString.length)];
    NSString *newRecord = [NSString stringWithFormat:@"%@ %@ %@ %@",
                           convertNumberString(firstNumberString),
                           condensedNameString,
                           convertNumberString(secondNumberString),
                           convertNumberString(thirdNumberString)];
    [newRecords addObject:newRecord];
}
NSString *outputString = [newRecords componentsJoinedByString:@"\n"];
[outputString writeToURL:outputFileURL
              atomically:YES
                encoding:NSUTF8StringEncoding
                   error:&error];
In this example convertNumberString is a plain C function that converts your number strings. It could of course also be a method, depending on the architecture or your preferences.
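For illustration only, here is one possible shape for convertNumberString. The question does not say which number bases are involved, so this sketch assumes the fragments are base-36 numbers (digits 0-9 and letters A-Z) that should be written out in decimal; substitute your own conversion logic.

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Sketch of convertNumberString. ASSUMPTION: the input is a base-36 number
// and the desired output is decimal; the original post does not specify the
// bases. Returns an empty string if the input is not a valid base-36 number.
static NSString *convertNumberString(NSString *numberString) {
    const char *cString = numberString.UTF8String;
    if (cString == NULL) {
        return @"";
    }
    char *end = NULL;
    unsigned long long value = strtoull(cString, &end, 36);
    if (end == cString || *end != '\0') {
        return @"";  // nothing parsed, or trailing characters that are not base-36 digits
    }
    return [NSString stringWithFormat:@"%llu", value];
}
```

Keeping the conversion in one small function like this means the record-splitting loop does not need to change when the real base or output format is known.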

Encrypted twitter feed

I'm developing an iOS application that will fetch tweets from Twitter.
I'm using the following API:
https://api.twitter.com/1/statuses/user_timeline.json?include_entities=true&include_rts=true&count=2&screen_name=TareqAlSuwaidan
The problem is with feeds in Arabic,
i.e. the text of the feed appears like this:
\u0623\u0646\u0643 \u0648\u0627\u0647\u0645
How can I get the real text (or how do I decode this to get the real text)?
This is not encrypted, it is Unicode. The code points U+0600 to U+06FF are Arabic. NSString handles Unicode.
Here is an example:
NSString *string = @"\u0623\u0646\u0643 \u0648\u0627\u0647\u0645";
NSLog(@"string: '%@'", string);
NSLog output:
string: 'أنك واهم'
The only question is what exactly the problem is: are you getting the Arabic text? Are you using NSJSONSerialization to deserialize the JSON? If so, there should be no problem.
Here is an example with the question URL (don't use synchronous requests in production code):
NSURL *url = [NSURL URLWithString:@"https://api.twitter.com/1/statuses/user_timeline.json?include_entities=true&include_rts=true&count=2&screen_name=TareqAlSuwaidan"];
NSData *data = [NSData dataWithContentsOfURL:url];
NSError *error;
NSArray *jsonObject = [NSJSONSerialization JSONObjectWithData:data options:NSJSONReadingMutableContainers error:&error];
NSDictionary *object1 = [jsonObject objectAtIndex:0];
NSString *text = [object1 objectForKey:@"text"];
NSLog(@"text: '%@'", text);
NSLog output:
text: '@Naser_Albdya أيدت الثورة السورية منذ بدايتها وارجع لليوتوب واكتب( سوريا السويدان )
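To see the decoding without a network request, the same parser can be fed a small JSON document whose text contains literal \u escapes. This is only a sketch: the helper name TextFromTweetJSON is made up, and it assumes a single JSON object with a "text" key rather than the array of tweets the real endpoint returns.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: parses a JSON object and returns the string stored
// under "text", with any \uXXXX escapes already decoded by the parser.
// Returns nil if the data is missing, not valid JSON, or has no string "text".
static NSString *TextFromTweetJSON(NSData *jsonData) {
    if (jsonData == nil) {
        return nil;
    }
    NSError *error = nil;
    id object = [NSJSONSerialization JSONObjectWithData:jsonData options:0 error:&error];
    if (![object isKindOfClass:[NSDictionary class]]) {
        return nil;
    }
    id text = ((NSDictionary *)object)[@"text"];
    return [text isKindOfClass:[NSString class]] ? text : nil;
}
```

If the escapes still show up after parsing, it is usually because a whole dictionary or array was logged; logging the string itself prints the Arabic characters.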
Those are Unicode literals. I think all that's needed is to use NSString's stringWithUTF8String: method on the string you have. That should use NSString's native Unicode handling to convert the literals to the actual characters. Example:
NSString *directFromTwitter = [twitterInterface getTweet];
// directFromTwitter contains "\u0623\u0646\u0643 \u0648\u0627\u0647\u0645"
NSString *encodedString = [NSString stringWithUTF8String:[directFromTwitter UTF8String]];
// encodedString contains "أنك واهم", or something like it
The method call inside the conversion call ([directFromTwitter UTF8String]) is to get access to the raw bytes of the string, that are used by stringWithUTF8String. I'm not exactly sure on what those code points come out to, I just relied on Python to do the conversion.
