Unicode to UTF8 in Swift - ios

I am using the Maps API and when searching for some addresses in foreign countries, the address comes back with Unicode characters embedded like this:
"Place du Panth\U00e9on",
"75005 Paris"
The unicode character in this instance is \U00e9 which is é
The trouble I am having is that SwiftyJSON pukes if I have saved this data in a JSON file and try to read it back. SwiftyJSON does not like the backslash character '\'. The JSON is valid, and even if I could read it, that would still not be good enough, as I would rather have é displayed properly, along with all other Unicode characters.
Does anyone have any ideas on how to convert all unicode characters to UTF8 encoding of that character in Swift?
Should I just write a function that searches for all of the Unicode characters and then convert them?

Unless someone has a better idea, I just wrote this function that is doing the trick for me now.
import Foundation
func convertFromUnicode(_ input: String) -> String {
// Parameters are constants in current Swift, so work on a mutable copy
var myString = input
let convertDict:[String:String] = ["\\U00c0":"À", "\\U00c1" :"Á","\\U00c2":"Â","\\U00c3":"Ã","\\U00c4":"Ä","\\U00c5":"Å","\\U00c6":"Æ","\\U00c7":"Ç","\\U00c8":"È","\\U00c9":"É","\\U00ca":"Ê","\\U00cb":"Ë","\\U00cc":"Ì","\\U00cd":"Í","\\U00ce":"Î","\\U00cf":"Ï","\\U00d1":"Ñ","\\U00d2":"Ò","\\U00d3":"Ó","\\U00d4":"Ô","\\U00d5":"Õ","\\U00d6":"Ö","\\U00d8":"Ø","\\U00d9":"Ù","\\U00da":"Ú","\\U00db":"Û","\\U00dc":"Ü","\\U00dd":"Ý","\\U00df":"ß","\\U00e0":"à","\\U00e1":"á","\\U00e2":"â","\\U00e3":"ã","\\U00e4":"ä","\\U00e5":"å","\\U00e6":"æ","\\U00e7":"ç","\\U00e8":"è","\\U00e9":"é","\\U00ea":"ê","\\U00eb":"ë","\\U00ec":"ì","\\U00ed":"í","\\U00ee":"î","\\U00ef":"ï","\\U00f0":"ð","\\U00f1":"ñ","\\U00f2":"ò","\\U00f3":"ó","\\U00f4":"ô","\\U00f5":"õ","\\U00f6":"ö","\\U00f8":"ø","\\U00f9":"ù","\\U00fa":"ú","\\U00fb":"û","\\U00fc":"ü","\\U00fd":"ý","\\U00ff":"ÿ"]
// Replace each escaped sequence with the character it stands for
for (key,value) in convertDict {
myString = myString.replacingOccurrences(of: key, with: value)
}
return myString
}

Instead of hardcoding all the characters I would decode it with an extension like:
extension String {
var decoded : String {
let data = self.data(using: .utf8)
let message = String(data: data!, encoding: .nonLossyASCII) ?? ""
return message
}
}
and then you could use it like this:
let myString = "Place du Panth\\U00e9on"
print(myString.decoded)
Which would print Place du Panthéon

Related

How to identify UTF-8 encoded text from a string and convert it to smiley\emoticon in Swift

I am building an app that supports smileys/emoticons. From the backend I am getting a response like this: str = "Hferuhggeðððððfjjnjrnjgnejfnsgjen".
This response string has UTF-8 encoded text in it; for the above str the UTF-8 encoded text is "ððððð".
Now I need to identify the location of the UTF-8 encoded text in the response, and convert that encoded text to an emoticon/smiley.
I finally found a solution: if you decode the string, you get the smiley. Here is the code:
let che = descriptionText.cString(using: .isoLatin1)
let decode_string = String(cString: che!, encoding: .utf8)
This worked for me.
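A slightly more defensive sketch of the same idea, for anyone who wants to avoid the force unwrap. It assumes the garbage characters come from UTF-8 bytes that were decoded as ISO Latin-1 somewhere along the way; the name repairingMojibake is made up for this example.

```swift
import Foundation

/// Hypothetical helper: undoes UTF-8 text that was mis-decoded as ISO Latin-1.
/// Returns nil when the text contains characters outside Latin-1 or when the
/// recovered bytes are not valid UTF-8.
func repairingMojibake(_ text: String) -> String? {
    // Recover the original bytes: each Latin-1 character maps to exactly one byte.
    guard let bytes = text.data(using: .isoLatin1) else { return nil }
    // Reinterpret those bytes as UTF-8.
    return String(data: bytes, encoding: .utf8)
}

// Build a mis-decoded sample from the UTF-8 bytes of an emoji.
let garbled = String(data: Data("😂".utf8), encoding: .isoLatin1)!
// Fall back to the untouched text when the repair is not possible.
print(repairingMojibake(garbled) ?? garbled)
```

Returning an optional lets the caller keep the original string when it was never garbled in the first place, for example genuine Latin-1 text such as "é".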

Emojis showing up as question marks in app made using swift(iOS), java(android), ruby(server), mongodb(database)

I've been working on a chat application in which users can send emojis. I take the string the user enters in the UITextField, put it in an NSDictionary, and send it to the server as JSON. On the server the message is read as a string in Ruby and then stored in MongoDB. When the other client makes the get-messages API call, the emojis show up as a box or a '?'.
P.S.: only emojis with a 5-character code show up like that, e.g. \u1F602,
but emojis with a 4-character code show up fine, e.g. \u2764.
I don't know whether the problem is in the client, the server, or the database, so I don't know which code to add here. Please say in the comments which code you need and I'll post it.
It feels like the problem is on the server, because it happens on both Android and iOS devices.
I have been banging my head on this for more than a month now. I would love it if someone could help.
Thanks
----EDIT----
I understand that in Ruby \u{1F602} works, but I don't know how to make the clients send it in that format. I'm just taking whatever the user types in the UITextField (for iOS) or EditText (for Android) and sending it as is.
Is there a way I can make that change in client or fix it on server somehow?
In iOS, to encode emojis as Unicode escapes, use the code below:
let msg:String = "😂😂"
extension String {
var encodeEmoji: String{
if let encodeStr = NSString(cString: self.cString(using: .nonLossyASCII)!, encoding: String.Encoding.utf8.rawValue){
return encodeStr as String
}
return self
}
}
let msgdata:String = msg.encodeEmoji
Send the encoded string to the server.
The response you get back from the server contains the same Unicode escapes. To decode them back to emojis, use the code below:
extension String {
var decodeEmoji: String{
let data = self.data(using: String.Encoding.utf8);
let decodedStr = NSString(data: data!, encoding: String.Encoding.nonLossyASCII.rawValue)
if let str = decodedStr{
return str as String
}
return self
}
}
let decodedstring = "Your Unicode String".decodeEmoji
If anyone is looking for an answer, this is how I fixed it.
For Android:
In your Gradle file, add the following dependency:
compile 'org.apache.commons:commons-lang3:3.6'
Then, when sending a message to the server, encode the text like this:
String msg = "User message here with emoji";
msg = StringEscapeUtils.escapeJava(msg);
To decode a message after receiving it from the server, use:
String msg = "Escaped message received from the server";
msg = StringEscapeUtils.unescapeJava(msg);
For iOS:
(Using @Ankit Chauhan's answer)
let msg:String = "😂😂"
extension String {
var encodeEmoji: String{
if let encodeStr = NSString(cString: self.cString(using: .nonLossyASCII)!, encoding: String.Encoding.utf8.rawValue){
return encodeStr as String
}
return self
}
}
let msgdata:String = msg.encodeEmoji
And to decode use this:
extension String {
var decodeEmoji: String{
let data = self.data(using: String.Encoding.utf8);
let decodedStr = NSString(data: data!, encoding: String.Encoding.nonLossyASCII.rawValue)
if let str = decodedStr{
return str as String
}
return self
}
}
let decodedstring = "Your Unicode String".decodeEmoji
The simplified notation without curly brackets assumes there are exactly four hex digits following \u. For code points above U+FFFF, which need more than four hex digits, one should use the complete expression with braces:
▶ "\u{1F602}"
#⇒ "😂"
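The same distinction can be seen from the Swift side. This is only an illustrative sketch using the two emojis from the question; it shows that U+2764 fits in one UTF-16 unit while U+1F602 needs a surrogate pair (and four UTF-8 bytes), which is one plausible reason a layer that only expects four-digit escapes would mangle it.

```swift
// Swift string literals always use braces, so any number of hex digits is fine.
let heart = "\u{2764}"
let laugh = "\u{1F602}"

for text in [heart, laugh] {
    let scalar = text.unicodeScalars.first!
    // Print the code point, then the number of UTF-16 units and UTF-8 bytes.
    print("U+" + String(scalar.value, radix: 16, uppercase: true),
          text.utf16.count,
          text.utf8.count)
}

// A code point above U+FFFF is stored as a surrogate pair in UTF-16.
let surrogates = Array(laugh.utf16)
print(surrogates.map { String($0, radix: 16, uppercase: true) })
```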
For Android I had to do two things:
1. Use escapeJava when sending a message to the server and unescapeJava when receiving one:
'org.apache.commons:commons-lang3' // its StringEscapeUtils is deprecated, don't use
'org.apache.commons:commons-text:1.2'
StringEscapeUtils.escapeJava(message)
StringEscapeUtils.unescapeJava(message)
2. Use the EmojiCompat library from Google:
https://developer.android.com/guide/topics/ui/look-and-feel/emoji-compat.html

String passed to web service inserts Unicode escape

I have a start location and a destination which come from the LocationServices framework.
The address line is a Swift String. However, when it is passed to a service as a parameter, it arrives with \U2013 inserted.
Example:
If the string is "100–198 Commerce St", it would be like
"100\U2013198 Commerce St"
\U2013 is getting inserted and I have no idea where it comes from.
You are trying to convert the Unicode string to an 8-bit string. Try this function to get your required string.
func convertString(string: String) -> String {
if let data = string.data(using: String.Encoding.ascii, allowLossyConversion: true) {
return String.init(data: data, encoding: .ascii)!
}
return ""
}
The address line contains a non-ASCII Unicode character (U+2013, the en dash), and when you pass it to your service it is written out as an escape sequence instead of being sent as a properly encoded (probably UTF-8) character.
That is a broad topic; I'd suggest looking into string encodings in general.
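To see where the escape comes from, it can help to look at the scalars in the string itself. A small sketch, assuming the address really contains an en dash (U+2013) between the numbers, as the escaped output suggests; the variable names are made up for illustration.

```swift
// Assumed input: the address with an en dash between the two numbers.
let address = "100\u{2013}198 Commerce St"

// List every scalar that is not plain ASCII.
let suspects = address.unicodeScalars.filter { !$0.isASCII }
for scalar in suspects {
    print("U+" + String(scalar.value, radix: 16, uppercase: true))
}

// If the service only needs ASCII, swap the en dash for a plain hyphen.
let cleaned = String(address.map { $0 == "\u{2013}" ? "-" : $0 })
print(cleaned)
```

Replacing the one offending character keeps the rest of the address intact, whereas a lossy ASCII conversion may alter other characters as well.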

I receive an improperly formatted unicode in a String

I am working with a web API that gives me strings like the following:
"Eat pok\u00e9."
Xcode complains that
Expected Hexadecimal code in braces after unicode escape
My understanding is that it should be converted to pok\u{00e9}, but I do not know how to achieve this.
Can anybody point me in the right direction to develop a way of converting these, as there are many in this API?
Bonus:
I also need to remove \n from the strings.
You may want to give us more context regarding what the raw server payload looked like, and show us how you're displaying the string. Some ways of examining strings in the debugger (or if you're looking at raw JSON) will show you escape strings, but if you use the string in the app, you'll see the actual Unicode character.
I wonder if you're just looking at raw JSON.
For example, I passed the JSON, {"foo": "Eat pok\u00e9."} to the following code:
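import Foundation
// The raw payload, as it would arrive from the server
let data = "{\"foo\": \"Eat pok\\u00e9.\"}".data(using: .utf8)!
let jsonString = String(data: data, encoding: .utf8)!
print(jsonString)
let dictionary = try! JSONSerialization.jsonObject(with: data, options: []) as! [String: String]
print(dictionary["foo"]!)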
And it output:
{"foo": "Eat pok\u00e9."}
Eat poké.
By the way, this standard JSON escape syntax should not be confused with Swift's string literal escape syntax, in which the hex sequence must be wrapped in braces:
print("Eat pok\u{00e9}.")
@Rob has an excellent solution for the server passing invalid Swift String literals.
If you need to convert "Eat pok\u00e9.\n" to "Eat poké." it can be done as follows with a regex (NSRegularExpression).
import Foundation
var input = "Eat pok\\u00e9.\n"
// removes newline
input = input.replacingOccurrences(of: "\n", with: "")
// regex helper function for sanity's sake: returns the first match, or "" if there is none
func regexGroup(for pattern: String, in text: String) -> String {
do {
let regex = try NSRegularExpression(pattern: pattern, options: [])
let nsString = NSString(string: text)
let results = regex.matches(in: text, options: [], range: NSMakeRange(0, nsString.length))
guard let first = results.first else { return "" }
return nsString.substring(with: first.range)
} catch {
print("invalid regex: \(error.localizedDescription)")
return ""
}
}
// matches the four hex digits that follow a literal "\u" (first escape only)
let unicodeHexStr = regexGroup(for: "(?<=\\\\u)[0-9a-fA-F]{4}", in: input)
let unicodeHex = Int(unicodeHexStr, radix: 16)!
let char = Character(UnicodeScalar(unicodeHex)!)
let replaced = input.replacingOccurrences(of: "\\u" + unicodeHexStr, with: String(char))
// prints "Eat poké."
print(replaced)
\u{00e9} is a format that's specific to Swift String literals. When the code is compiled, this notation is parsed and converted into the actual Unicode scalar it represents.
What you've received is a String that escapes Unicode scalars in a particular way. To transform those escaped Unicode scalars into the Unicode scalars they represent, see this answer.
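For a string that has already reached your code with literal \uXXXX sequences in it, one way to do that transformation is a small scanner. This is a minimal sketch under stated assumptions: the function name is made up, it only handles escapes with exactly four hex digits, and it leaves surrogate halves (such as \ud83d) untouched because they are not valid scalars on their own.

```swift
/// Hypothetical helper: replaces each `\uXXXX` escape (exactly four hex
/// digits) with the Unicode scalar it names. Anything else is copied as is.
func unescapingUnicode(_ text: String) -> String {
    let chars = Array(text)
    var result = ""
    var i = 0
    while i < chars.count {
        // Look for a backslash, a "u", and four hex digits that form a valid scalar.
        if chars[i] == "\\", i + 5 < chars.count, chars[i + 1] == "u",
           chars[(i + 2)...(i + 5)].allSatisfy({ $0.isHexDigit }),
           let value = UInt32(String(chars[(i + 2)...(i + 5)]), radix: 16),
           let scalar = Unicode.Scalar(value) {
            result.unicodeScalars.append(scalar)
            i += 6
        } else {
            result.append(chars[i])
            i += 1
        }
    }
    return result
}

print(unescapingUnicode("Eat pok\\u00e9."))
```

If the text is still raw JSON, letting the JSON parser do the unescaping is simpler and also covers surrogate pairs.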

How to deal with a user input string that gives an "unprintable ascii character found in source file" error when pasted into Xcode?

I am working on an app that lets the user paste in text and then the app processes that text.
With a certain text string I am getting an "unprintable ascii character found in source file" error. The unprintable character appears to be a tab, but I'm not sure. Anyway, it is causing problems when I try to process the text.
How can I filter out this or other unprintable characters when I first save the string in a variable?
Or is there another way to deal with this?
Here's another way to do it.
This version also allows new line characters.
func convertString(string: String) -> String {
// Lossy conversion replaces or drops characters that have no ASCII representation
let data = string.data(using: .ascii, allowLossyConversion: true)
return String(data: data!, encoding: .ascii)!
}
If you are only interested in keeping printable ASCII characters, then this code should work.
extension String {
func printableAscii() -> String {
return String(bytes: self.utf8.filter { $0 >= 32 }, encoding: .utf8) ?? ""
}
}
Note that this also filters out tabs and line feeds, which may not be what you expect. The unprintable ASCII characters are the values below 0x20 (plus DEL, 0x7F, which this filter does not remove).
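If tabs and line feeds should survive, a variant that works on Unicode scalars instead of raw bytes is one option. This is a sketch, not a drop-in from either answer: the method name and the keeping parameter are made up for illustration.

```swift
extension String {
    /// Hypothetical helper: drops ASCII control characters (below 0x20, plus
    /// DEL at 0x7F) except the ones listed in `allowed`.
    func removingControlCharacters(keeping allowed: Set<Unicode.Scalar> = ["\t", "\n"]) -> String {
        var kept = String.UnicodeScalarView()
        for scalar in unicodeScalars {
            let isControl = scalar.value < 0x20 || scalar.value == 0x7F
            if !isControl || allowed.contains(scalar) {
                kept.append(scalar)
            }
        }
        return String(kept)
    }
}

// The BEL character (0x07) is removed, the tab and line feed stay.
let pasted = "a\tb\u{07}c\n"
let cleanedText = pasted.removingControlCharacters()
```

Passing an empty set for keeping strips tabs and line feeds as well, which matches the behaviour of the byte filter above for ASCII input.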
