How to compose multi-character emoji from raw hex

I'm getting JSON like this from the server:
{
"unicode":"1f468-1f468-1f467-1f467"
}
and I'm supposed to translate it into its composite character for display and/or copying to the pasteboard: 👨‍👨‍👧‍👧
The solution so far comes from this SO question:
let u = json["unicode"] as? String
let dashless = u.characters.split{$0 == "-"}.map(String.init)
let charArray = dashless.map { char -> Character in
let code = Int(strtoul(char, nil, 16))
return Character(UnicodeScalar(code))
}
let unicode = String(charArray)
UIPasteboard.generalPasteboard().string = unicode
This works great for single-character emoji definitions.
E.g., I can run the code above with this JSON…
{
"unicode":"1f4a9"
}
…and paste the expected result: 💩. But when I do with the mmgg family emoji listed earlier, I get the following in iOS, minus the spaces: 👨‍ 👨‍ 👧‍ 👧. They just don't seem to want to combine when pasted into a text field.
Is this an iOS bug, or am I doing something wrong?

Try this in your playground to see the difference:
"👨👨👧👧".unicodeScalars.forEach { (c) in
print(c.escape(asASCII: true),terminator: "")
}
print("")
"👨‍👨‍👧‍👧".unicodeScalars.forEach { (c) in
print(c.escape(asASCII: true), terminator: "")
}
/*
\u{0001F468}\u{0001F468}\u{0001F467}\u{0001F467}
\u{0001F468}\u{200D}\u{0001F468}\u{200D}\u{0001F467}\u{200D}\u{0001F467}
*/
Your original code, slightly modified:
import Darwin // strtoul
let u = "1f468-1f468-1f467-1f467"
let dashless = u.characters.split{$0 == "-"}.map(String.init)
let emoji = dashless.map { char -> String in
let code = Int(strtoul(char, nil, 16))
return String(UnicodeScalar(code))
}.joinWithSeparator("\u{200D}")
print(emoji) // 👨‍👨‍👧‍👧
Pure Swift code, with no Foundation and without strtoul:
let u = "1f468-1f468-1f467-1f467"
let emoji = u.characters.split("-")
.map {String(UnicodeScalar(Int(String($0),radix: 16) ?? 0))}
.joinWithSeparator("\u{200D}")
print(emoji) // 👨‍👨‍👧‍👧
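On newer Swift versions, where String.characters and joinWithSeparator are no longer available, the same idea could be sketched roughly as follows. The helper name composeEmoji(fromHex:) is hypothetical. The sketch assumes the server never includes 200d itself in the hex string, and that every multi-part value it sends is meant to be a zero-width-joiner sequence (flags and skin-tone modifiers, for example, combine without a joiner, so they would need different handling).

```swift
// Hypothetical helper: turns "1f468-1f468-1f467-1f467" into a ZWJ-joined string.
// Returns nil if any part is not valid hex or not a valid Unicode scalar value.
func composeEmoji(fromHex hex: String) -> String? {
    var parts: [String] = []
    for piece in hex.split(separator: "-") {
        guard let value = UInt32(piece, radix: 16),
              let scalar = Unicode.Scalar(value) else {
            return nil
        }
        parts.append(String(scalar))
    }
    // Assumption: the parts are joined with U+200D (zero width joiner).
    return parts.joined(separator: "\u{200D}")
}

if let family = composeEmoji(fromHex: "1f468-1f468-1f467-1f467") {
    print(family)
}
```

Returning an optional instead of falling back to scalar 0 keeps malformed server data from silently producing a NUL character in the pasteboard string.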

Related

Remove special characters from the string

I am trying to use an iOS app to dial a number. The problem is that the number is in the following format:
po placeAnnotation.mapItem.phoneNumber!
"‎+1 (832) 831-6486"
I want to get rid of some special characters and I want the following:
832-831-6486
I used the following code but it did not remove anything:
let charactersToRemove = CharacterSet(charactersIn: "()+-")
var telephone = placeAnnotation.mapItem.phoneNumber?.trimmingCharacters(in: charactersToRemove)
Any ideas?

placeAnnotation.mapItem.phoneNumber!.components(separatedBy: CharacterSet.decimalDigits.inverted)
.joined()
Here you go! I tested this and it works well.
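As background, trimmingCharacters(in:) only strips characters from the two ends of a string, which is why the code in the question left the number unchanged. A rough sketch of getting to the 832-831-6486 form asked for is below. The function name dialableNumber(from:) is hypothetical, and the sketch assumes North American numbers only: ten digits, optionally preceded by the country code 1.

```swift
// Hypothetical helper: reduces "+1 (832) 831-6486" to "832-831-6486".
// Assumption: a 10-digit number, optionally prefixed with the country code 1.
func dialableNumber(from raw: String) -> String? {
    // Keep ASCII digits only; this also drops spaces, punctuation,
    // and any invisible formatting characters in the input.
    var digits = raw.filter { $0.isASCII && $0.isNumber }
    if digits.count == 11, digits.hasPrefix("1") {
        digits.removeFirst()
    }
    guard digits.count == 10 else { return nil }
    let area = digits.prefix(3)
    let exchange = digits.dropFirst(3).prefix(3)
    let line = digits.suffix(4)
    return "\(area)-\(exchange)-\(line)"
}

print(dialableNumber(from: "+1 (832) 831-6486") ?? "not a 10-digit number")
```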

If you want something similar to CharacterSet with some flexibility, this should work:
let phoneNumber = "1 (832) 831-6486"
let charsToRemove: Set<Character> = Set("()+-".characters)
let newNumberCharacters = String(phoneNumber.characters.filter { !charsToRemove.contains($0) })
print(newNumberCharacters) //prints 1 832 8316486

I know the question has already been answered, but to format phone numbers in an arbitrary way you could use a custom formatter like the one below:
class PhoneNumberFormatter:Formatter
{
var numberFormat:String = "(###) ### ####"
override func string(for obj: Any?) -> String? {
if let number = obj as? NSNumber
{
var input = number.int64Value
var output = numberFormat
while output.characters.contains("#")
{
if let range = output.range(of: "#", options: .backwards)
{
output = output.replacingCharacters(in: range, with: "\(input % 10)")
input /= 10
}
else
{
output = output.replacingOccurrences(of: "#", with: "")
}
}
return output
}
return nil
}
func string(from number:NSNumber) -> String?
{
return string(for: number)
}
}
let phoneNumberFormatter = PhoneNumberFormatter()
// Digits are filled in from the right in place of the hashes. It is easy to change the format to suit your needs.
phoneNumberFormatter.numberFormat = "###-##-##-##-##"
phoneNumberFormatter.string(from: 18063783889)

Swift 3
func removeSpecialCharsFromString(_ str: String) -> String {
struct Constants {
static let validChars = Set("1234567890-".characters)
}
return String(str.characters.filter { Constants.validChars.contains($0) })
}
Usage:
let str : String = "+1 (832) 831-6486"
let newStr : String = self.removeSpecialCharsFromString(str)
print(newStr)
Note: you can add to validChars any other characters you want to keep in the resulting string.

If you have a string of digits and special characters, use the following code to remove the special characters:
let numberWithSpecialChar = "1800-180-0000"
let actualNumber = numberWithSpecialChar.components(separatedBy: CharacterSet.decimalDigits.inverted).joined()
Otherwise, if you have a string of letters and special characters, use the following code to remove the special characters:
let charactersWithSpecialChar = "A man, a plan, a cat, a ham, a yak, a yam, a hat, a canal-Panama!"
let actualString = charactersWithSpecialChar.components(separatedBy: CharacterSet.letters.inverted).joined(separator: " ")

NSString *str = @"(123)-456-7890";
NSLog(@"String: %@", str);
// Create character set with specified characters
NSMutableCharacterSet *characterSet =
[NSMutableCharacterSet characterSetWithCharactersInString:@"()-"];
// Build array of components using specified characters as separators
NSArray *arrayOfComponents = [str componentsSeparatedByCharactersInSet:characterSet];
// Create string from the array components
NSString *strOutput = [arrayOfComponents componentsJoinedByString:@""];
NSLog(@"New string: %@", strOutput);

String with Unicode (variable) [duplicate]

I have a problem I couldn't find a solution to.
I have a string variable holding the unicode "1f44d" and I want to convert it to a unicode character 👍.
Usually one would do something like this:
println("\u{1f44d}") // 👍
Here is what I mean:
let charAsString = "1f44d" // code in variable
println("\u{\(charAsString)}") // not working
I have tried several other ways but somehow the workings behind this magic stay hidden for me.
One should imagine the value of charAsString coming from an API call or from another object.

One possible solution (explanations "inline"):
let charAsString = "1f44d"
// Convert hex string to numeric value first:
var charCode : UInt32 = 0
let scanner = NSScanner(string: charAsString)
if scanner.scanHexInt(&charCode) {
// Create string from Unicode code point:
let str = String(UnicodeScalar(charCode))
println(str) // 👍
} else {
println("invalid input")
}
Slightly simpler with Swift 2:
let charAsString = "1f44d"
// Convert hex string to numeric value first:
if let charCode = UInt32(charAsString, radix: 16) {
// Create string from Unicode code point:
let str = String(UnicodeScalar(charCode))
print(str) // 👍
} else {
print("invalid input")
}
Note also that not all code points are valid Unicode scalars,
compare Validate Unicode code point in Swift.
Update for Swift 3:
public init?(_ v: UInt32)
is now a failable initializer of UnicodeScalar and checks if the
given numeric input is a valid Unicode scalar value:
let charAsString = "1f44d"
// Convert hex string to numeric value first:
if let charCode = UInt32(charAsString, radix: 16),
let unicode = UnicodeScalar(charCode) {
// Create string from Unicode code point:
let str = String(unicode)
print(str) // 👍
} else {
print("invalid input")
}

This can be done in two steps:
convert charAsString to Int code
convert code to unicode character
Second step can be done e.g. like this
var code = 0x1f44d
var scalar = UnicodeScalar(code)
var string = "\(scalar)"
As for the first step, see here how to convert a String in hex representation to an Int.

As of Swift 2.0, every integer type has an initializer that takes a String as input. You can then easily generate the corresponding UnicodeScalar and print it, without having to change your representation of characters as strings.
UPDATED: Swift 3.0 changed the UnicodeScalar initializer
print("\u{1f44d}") // 👍
let charAsString = "1f44d" // code in variable
let charAsInt = Int(charAsString, radix: 16)! // As indicated by @MartinR, the radix is required; the default won't do
let uScalar = UnicodeScalar(charAsInt)! // In Swift 3.0 this initializer is failable, so you need either a force unwrap or optional binding
print("\(uScalar)")

You can use
let char = "-12"
print(char.unicodeScalars.map { $0.value })
You'll get the values as:
[45, 49, 50]

Here are a couple ways to do it:
let string = "1f44d"
Solution 1:
"&#x\(string);".applyingTransform(.toXMLHex, reverse: true)
Solution 2:
"U+\(string)".applyingTransform(StringTransform("Hex/Unicode"), reverse: true)

I made this extension that works pretty well:
extension String {
var unicode: String? {
if let charCode = UInt32(self, radix: 16),
let unicode = UnicodeScalar(charCode) {
let str = String(unicode)
return str
}
return nil
}
}
How to test it:
if let test = "e9c8".unicode {
print(test)
}
// prints the character U+E9C8 (a private-use code point, so the glyph shown depends on the font)

You cannot use string interpolation in Swift as you try to use it. Therefore, the following code won't compile:
let charAsString = "1f44d"
print("\u{\(charAsString)}")
You will have to convert your string variable into an integer (using init(_:radix:) initializer) then create a Unicode scalar from this integer. The Swift 5 Playground sample code below shows how to proceed:
let validCodeString = "1f44d"
let validUnicodeScalarValue = Int(validCodeString, radix: 16)!
let validUnicodeScalar = Unicode.Scalar(validUnicodeScalarValue)!
print(validUnicodeScalar) // 👍

Check if string latin or cyrillic

Is there some way to check whether a string is Latin or Cyrillic? I've tried the localizedCompare String method, but it didn't give me the result I needed.

What about something like this?
extension String {
var isLatin: Bool {
let upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
let lower = "abcdefghijklmnopqrstuvwxyz"
for c in self.characters.map({ String($0) }) {
if !upper.containsString(c) && !lower.containsString(c) {
return false
}
}
return true
}
var isCyrillic: Bool {
let upper = "АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЮЯ"
let lower = "абвгдежзийклмнопрстуфхцчшщьюя"
for c in self.characters.map({ String($0) }) {
if !upper.containsString(c) && !lower.containsString(c) {
return false
}
}
return true
}
var isBothLatinAndCyrillic: Bool {
return self.isLatin && self.isCyrillic
}
}
Usage:
let s = "Hello"
if s.isLatin && !s.isBothLatinAndCyrillic {
// String is latin
} else if s.isCyrillic && !s.isBothLatinAndCyrillic {
// String is cyrillic
} else if s.isBothLatinAndCyrillic {
// String can be either latin or cyrillic
} else {
// String is not latin nor cyrillic
}
Consider that there are cases where the given string could be both, for example the string:
let s = "A"
This can be either Latin or Cyrillic, which is why there is an "is both" property.
And it can also be none of them:
let s = "*"

You should get all the Unicode scalars and detect whether the string contains Cyrillic or Latin characters based on their Unicode values. This code is not complete; you can complete it.
let a : String = "ӿ" //unicode value = 04FF
let scalars = a.unicodeScalars
// Get the Unicode value of the first character:
let unicodeValue = scalars[scalars.startIndex].value // 1279, corresponding to 04FF
Check here for all Unicode values (in hexadecimal):
http://jrgraphix.net/r/Unicode/0400-04FF
According to this site, Cyrillic values range from 0400 to 04FF (1024 to 1279).
This is the code for the Cyrillic check:
var isCyrillic = true
for (index, unicode) in scalars.enumerate() {
if (unicode.value < 1024 || unicode.value > 1279) {
print("not a cyrillic text")
print(unicode.value)
isCyrillic = false
break
}
}

Surprisingly, there's no easy answer to your question. The Latin alphabet contains more than just A - Z. There are accented characters in French and archaic forms in German, etc. I don't know the Cyrillic alphabet so I'll leave it alone. On top of that, you have to deal with: punctuation (.,?"(), etc.) and white space, emojis, arrows, dingbats... which are language neutral. The complexity can escalate very quickly depending on your requirements.
The answer you accepted is inadequate to say the least: "hello world".isLatin == false since it doesn't deal with white spaces.
Visit a site like this one to learn what ranges contain characters for which language and play with the code below. It's not a complete answer but meant to get you started:
let neutralRanges = [0x20...0x40]
let latinRanges = [0x41...0x5A, 0x61...0x7A, 0xC0...0xFF, 0x100...0x17F]
let cyrillicRanges = [0x400...0x4FF, 0x500...0x52F]
func scalar(scalar: UnicodeScalar, isInRanges ranges: [Range<Int>]) -> Bool {
for r in ranges {
if r ~= Int(scalar.value) {
return true
}
}
return false
}
let str = "Hello world"
var isLatin = true
var isCyrillic = true
for s in str.unicodeScalars {
if scalar(s, isInRanges: neutralRanges) {
continue
}
// Test both scripts independently, so that a scalar outside both ranges clears both flags
if !scalar(s, isInRanges: latinRanges) {
isLatin = false
}
if !scalar(s, isInRanges: cyrillicRanges) {
isCyrillic = false
}
}
print(isLatin)
print(isCyrillic)
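For newer Swift versions, the range check above could be sketched along these lines. The ranges are the same ones used in this answer; the function name isWritten(_:in:) is hypothetical. As in the original, this is a starting point rather than a complete script detector: a string made up only of neutral characters (or an empty string) counts as matching any script.

```swift
// Same ranges as above, typed as closed ranges over scalar values.
let neutralRanges: [ClosedRange<UInt32>] = [0x20...0x40]
let latinRanges: [ClosedRange<UInt32>] = [0x41...0x5A, 0x61...0x7A, 0xC0...0xFF, 0x100...0x17F]
let cyrillicRanges: [ClosedRange<UInt32>] = [0x400...0x4FF, 0x500...0x52F]

// Hypothetical helper: true if every scalar is either neutral
// or falls inside one of the given script ranges.
func isWritten(_ text: String, in scriptRanges: [ClosedRange<UInt32>]) -> Bool {
    return text.unicodeScalars.allSatisfy { scalar in
        let value = scalar.value
        return neutralRanges.contains(where: { $0.contains(value) })
            || scriptRanges.contains(where: { $0.contains(value) })
    }
}

print(isWritten("Hello world", in: latinRanges))    // true
print(isWritten("Hello world", in: cyrillicRanges)) // false
```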

A couple of comments refer to another post that shows a fairly clean way to determine the language of a String using NSLinguisticTagger (How to detect text (string) language in iOS? ).
NSLinguisticTagger is definitely the best approach here and is intended exactly for this purpose, but it sounds to me like you're actually asking how to identify the script of the String rather than the language. English, French, German (for example) all use Latin script so the language example above doesn't show the ideal way to discern between Latin and Cyrillic (or other scripts).
Instead I wrote the following extension to String that shows how to identify the script for the first sentence in the String you supply - you can then easily adapt/build on this to get the exact thing you want for your use case:
import Foundation // Needed for NSLinguisticTagger
extension String {
func scriptCode() -> NSLinguisticTag? {
let linguisticTagger = NSLinguisticTagger(tagSchemes: [.script], options: 0)
linguisticTagger.string = self
return linguisticTagger.tag(at: 0, unit: .sentence, scheme: .script, tokenRange: nil)
}
}
Scripts are uniformly described by four-letter ISO 15924 script codes, such as "Latn", and this is what you get with the returned NSLinguisticTag object. To perform a comparison, just check the raw value of NSLinguisticTag, for example like this:
if let script = yourTestSentence.scriptCode()?.rawValue, script == "Latn" || script == "Cyrl" {
print("This sentence is in Latin or Cyrillic script")
} else {
print("Some other script")
}
Caveat: This example only checks the first sentence of whatever string you supply. I haven't tested what happens if that sentence is mixed scripts - most likely the returned tag will be nil.
Here are some useful reference links to Apple's docs, and Wikipedia for more info:
https://developer.apple.com/documentation/foundation/nslinguistictagger
https://developer.apple.com/documentation/foundation/nslinguistictagscheme
https://en.wikipedia.org/wiki/ISO_15924

I hope that this also can be useful
let cyrillicToLatinMap: [Character : String] = [
" ":" ",
"А":"A",
"Б":"B",
"В":"V",
"Г":"G",
"Д":"D",
"Е":"E",
"Ж":"Zh",
"З":"Z",
"И":"I",
"Й":"Y",
"К":"K",
"Л":"L",
"М":"M",
"Н":"N",
"О":"O",
"П":"P",
"Р":"R",
"С":"S",
"Т":"T",
"У":"U",
"Ф":"F",
"Х":"H",
"Ц":"Ts",
"Ч":"Ch",
"Ш":"Sh",
"Щ":"Sht",
"Ъ": "A",
"Ю":"Yu",
"Я":"Ya",
"а":"a",
"б":"b",
"в":"v",
"г":"g",
"д":"d",
"е":"e",
"ж":"zh",
"з":"z",
"и":"i",
"й":"y",
"к":"k",
"л":"l",
"м":"m",
"н":"n",
"о":"o",
"п":"p",
"р":"r",
"с":"s",
"т":"t",
"у":"u",
"ф":"f",
"х":"h",
"ц":"ts",
"ч":"ch",
"ш":"sh",
"щ":"sht",
"ъ": "a",
"ь":"y",
"ю":"yu",
"я":"ya",]
Bulgarian Cyrillic to Latin
class CyrilicToLatinConverter {
public static func getLatin(wordInCyrillic: String) -> String{
if(wordInCyrillic.isEmpty) {return wordInCyrillic}
else{
let characters = Array(wordInCyrillic)
var wordInLatin: String = ""
for n in 0..<characters.count {
if isCyrillic(characters: characters[n]) {
wordInLatin+=cyrillicToLatinMap[characters[n]] ?? ""
}
else{
return ""
}
}
return wordInLatin
}
}
public static func isCyrillic(characters: Character) -> Bool {
var isCyrillic: Bool = true;
for (key,_) in cyrillicToLatinMap{
isCyrillic = (key == characters)
if isCyrillic {
break
}
}
return isCyrillic
}
}

Swift 3:
For Persian and Arabic
extension String {
var isFarsi: Bool {
// Trim leading and trailing whitespace
let value = self.trimmingCharacters(in: CharacterSet.whitespacesAndNewlines)
if value == "" {
return false
}
let farsiLetters = "آ ا ب پ ت ث ج چ ح خ د ذ ر ز ژ س ش ص ض ط ظ ع غ ف ق ک گ ل م ی ن و ه"
let arabicLetters = " ء ا أ إ ء ؤ ئـ ئ آ اً ة ا ب ت ث ج ‌ ح خ د ذ ر ز س ‌ ش ص ض ط ظ ع غ ف ق ك ل م ن ه و ي"
for c in value.characters.map({ String($0) }) {
if !farsiLetters.contains(c) && !arabicLetters.contains(c) {
return false
}
}
return true
}
}

Swift 5 solution:
extension String {
var isLatin: Bool {
let upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
let lower = "abcdefghijklmnopqrstuvwxyz"
for c in self.map({String($0)}) where !upper.contains(c) && !lower.contains(c) {
return false
}
return true
}
}

Recreating Python's input statement in Swift

I was trying to recreate Python's input() statement in Swift. I have seen some examples, but I am trying to improve on them. First, my version removes the \n from the string. Second, I want it to print an optional prompt first, so that var example = input() just waits for input (which it does), while var example = input("Enter text: ") prints Enter text: and then waits for text to be entered.
The problem is that Swift seems to be mixing up the order of the prints. For example, given this code:
import Foundation
func input(inputStatement: String? = nil) -> String {
if let inputStatement = inputStatement {
print(inputStatement, terminator:"")
}
let keyboard = NSFileHandle.fileHandleWithStandardInput()
let inputData = keyboard.availableData
var strData = NSString(data: inputData, encoding: NSUTF8StringEncoding) as! String
strData = strData.stringByReplacingOccurrencesOfString("\n", withString: "")
print()
return strData
}
print("Creating the input statement in Swift!")
var test = input("What's your name: ")
print("You entered: \(test).")
With the input text "hi", this prints:
Creating the input statement in Swift!
hi
What's your name: You entered: hi.
And what I expected was:
Creating the input statement in Swift!
What's your name: hi
You entered: hi.
What am I missing here?
Thanks

The problem is that the standard output file descriptor is line buffered
when writing to a terminal (and fully buffered otherwise).
Therefore the output of
print(inputStatement, terminator:"")
is buffered and not written before the
print()
writes a newline. You can fix that by flushing the file
descriptor explicitly:
if let inputStatement = inputStatement {
print(inputStatement, terminator:"")
fflush(stdout)
}
Note also that there is a
public func readLine(stripNewline stripNewline: Bool = default) -> String?
which reads a line from standard input, with the option to
remove the trailing newline character. This function also
flushes standard output. Therefore a simpler implementation would be
func input(prompt: String = "") -> String {
print(prompt, terminator: "")
guard let reply = readLine(stripNewline: true) else {
fatalError("Unexpected EOF on input")
}
return reply
}
(Of course you might choose to handle "end of file" differently.)

Replace part of string with lower case letters - Swift

I have a Swift based iOS app and one of the features allows you to comment on a post. Anyway, users can add "#mentions" in their posts to tag other people. However I want to stop the user from adding a username with a capital letter.
Is there anyway I can convert a string, so that the #usernames are all in lowercase?
For example:
I really enjoy sightseeing with #uSerABC (not allowed)
I really enjoy sightseeing with #userabc (allowed)
I know there is a string property in Swift called .lowercaseString, but the problem is that it makes the entire string lowercase, and that's not what I want. I only want the #username to be in lower case.
Is there any way around this while still using the .lowercaseString property?
Thanks for your time, Dan.

This comes from code I use to detect hashtags, which I've modified to detect mentions:
func detectMentionsInText(text: String) -> [NSRange]? {
let mentionsDetector = try? NSRegularExpression(pattern: "#(\\w+)", options: NSRegularExpressionOptions.CaseInsensitive)
let results = mentionsDetector?.matchesInString(text, options: NSMatchingOptions.WithoutAnchoringBounds, range: NSMakeRange(0, text.utf16.count)).map { $0 }
return results?.map{$0.rangeAtIndex(0)}
}
It detects all the mentions in a string using a regex and returns an array of NSRange values. Each range gives you the beginning and end of a "mention", so you can easily replace it with a lowercase version.

Split the string into two using the following command -
let arr = myString.componentsSeparatedByString("#")
//Convert arr[1] to lower case
//Append to arr[0]
//Enjoy

Thanks to everyone for their help. In the end I couldn't get any of the solutions to work and after a lot of testing, I came up with this solution:
func correctStringWithUsernames(inputString: String, completion: (correctString: String) -> Void) {
// Create the final string and get all
// the separate strings from the data.
var finalString: String!
var commentSegments: NSArray!
commentSegments = inputString.componentsSeparatedByString(" ")
if (commentSegments.count > 0) {
for (var loop = 0; loop < commentSegments.count; loop++) {
// Check the username to ensure that there
// are no capital letters in the string.
let currentString = commentSegments[loop] as! String
let capitalLetterRegEx = ".*[A-Z]+.*"
let textData = NSPredicate(format:"SELF MATCHES %@", capitalLetterRegEx)
let capitalResult = textData.evaluateWithObject(currentString)
// Check if the current loop string
// is a #user mention string or not.
if (currentString.containsString("#")) {
// If we are in the first loop then set the
// string otherwise concatenate the string.
if (loop == 0) {
if (capitalResult == true) {
// The username contains capital letters
// so change it to a lower case version.
finalString = currentString.lowercaseString
}
else {
// The username does not contain capital letters.
finalString = currentString
}
}
else {
if (capitalResult == true) {
// The username contains capital letters
// so change it to a lower case version.
finalString = "\(finalString) \(currentString.lowercaseString)"
}
else {
// The username does not contain capital letters.
finalString = "\(finalString) \(currentString)"
}
}
}
else {
// The current string is NOT a #user mention
// so simply set or concatenate the finalString.
if (loop == 0) {
finalString = currentString
}
else {
finalString = "\(finalString) \(currentString)"
}
}
}
}
else {
// No issues pass back the string.
finalString = inputString
}
// Pass back the correct username string.
completion(correctString: finalString)
}
It's certainly not the most elegant or efficient solution around, but it does work. If there are any ways of improving it, please leave a comment.
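As one possible simplification, the word-by-word approach above could be condensed roughly as follows in newer Swift. The function name lowercasingMentions(in:marker:) and its marker parameter are hypothetical. Unlike the code above, this sketch only treats a word as a mention when it starts with the marker, and it assumes mentions are separated from other words by single spaces.

```swift
// Hypothetical helper: lowercases every space-separated word that starts
// with the mention marker and leaves all other words untouched.
func lowercasingMentions(in text: String, marker: Character = "#") -> String {
    return text
        .split(separator: " ", omittingEmptySubsequences: false)
        .map { $0.first == marker ? $0.lowercased() : String($0) }
        .joined(separator: " ")
}

print(lowercasingMentions(in: "I really enjoy sightseeing with #uSerABC"))
// I really enjoy sightseeing with #userabc
```

Keeping empty subsequences in the split means runs of spaces in the original comment survive the round trip unchanged.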
