Unicode Character Conversion

I am attempting to use the Unicode character U+00AE in a UITextView. It displays correctly if I write \u{00AE} in a string literal, as below:
textView.text = "THING AND STUFF TrademarkedThing\u{00AE}"
However, if I pull some text from another location (this is technically coming from an API call, but that shouldn't matter), and assign it to the textView, I do not get the unicode character:
var apiText = "TrademarkedThing\u{00AE}" //Pulled from API call as text and saved into text variable
textView.text = "THING AND STUFF " + apiText
So in the code below, the first Unicode character does not show, but the second does.
var apiText = "TrademarkedThing\u{00AE}" //Pulled from API call as text and saved into text variable
textView.text = "THING AND STUFF " + apiText + " \u{00AE}"
Why won't the unicode from that text show?

The Unicode conversion (from \u{...}) only happens for string literals. You can see the problem if you compare these two:
let t1 = "Thing\u{00AE}"
// Thing®
let t2 = "Thing\\u{00" + "AE}"
// Thing\u{00AE}
Since your string is coming from another source, it acts like the second one. The Swift language book has a section on Unicode characters in string literals.
If you want to interpret those Unicode sequences after the fact, here's a String extension:
extension String {
    // Returns the index where `str` first occurs at or after `fromIndex`, or nil if it does not occur.
    func indexOfSubstring(_ str: String, fromIndex: String.Index? = nil) -> String.Index? {
        var index = fromIndex ?? startIndex
        while index < endIndex {
            if self[index...].hasPrefix(str) {
                return index
            }
            index = self.index(after: index)
        }
        return nil
    }
    // Replaces each literal "\u{XXXX}" sequence with the Unicode scalar it names.
    func convertedUnicodeSequences() -> String {
        if let index = indexOfSubstring("\\u{"),
           let nextIndex = indexOfSubstring("}", fromIndex: index) {
            let substr = self[self.index(index, offsetBy: 3)..<nextIndex]
            let rest = String(self[self.index(after: nextIndex)...]).convertedUnicodeSequences()
            // Convert only when the hex digits name a valid scalar; otherwise keep the sequence as-is.
            if let value = UInt32(substr, radix: 16), let scalar = UnicodeScalar(value) {
                return String(self[..<index]) + String(Character(scalar)) + rest
            }
            return String(self[...nextIndex]) + rest
        }
        return self
    }
}
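As an alternative sketch, the same decoding can be done in a single pass without recursion. The function name decodeUnicodeEscapes is hypothetical (it is not part of the answer above), and the sketch assumes the escapes always use the \u{...} form with hexadecimal digits; anything that does not parse is left untouched.

```swift
// Hypothetical helper: decodes literal "\u{XXXX}" sequences in one pass.
func decodeUnicodeEscapes(_ input: String) -> String {
    var output = ""
    var rest = input[...]
    while !rest.isEmpty {
        // Look for an escape starting exactly at the current position.
        if rest.hasPrefix("\\u{"), let close = rest.firstIndex(of: "}") {
            let hexStart = rest.index(rest.startIndex, offsetBy: 3)
            let hex = rest[hexStart..<close]
            if let value = UInt32(hex, radix: 16), let scalar = Unicode.Scalar(value) {
                output.unicodeScalars.append(scalar)
                rest = rest[rest.index(after: close)...]
                continue
            }
        }
        // Not a valid escape here: copy one character through unchanged.
        output.append(rest.removeFirst())
    }
    return output
}

let apiText = "TrademarkedThing\\u{00AE}" // stands in for text received from an API
let decoded = decodeUnicodeEscapes(apiText)
```

Here apiText holds the backslash sequence as plain characters, the way it would arrive from an external source, and decoded holds the string with the actual U+00AE character.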

Related

backspace not work in outside of regex in swift

I use this method to apply a pattern to the phone number in a UITextField on the .editingChanged event, but the delete key only removes the digits:
extension String{
func applyPatternOnNumbers(pattern: String) -> String {
let replacmentCharacter: Character = "#"
let pureNumber = self.replacingOccurrences( of: "[^۰-۹0-9]", with: "", options: .regularExpression)
var result = ""
var pureNumberIndex = pureNumber.startIndex
for patternCharacter in pattern {
if patternCharacter == replacmentCharacter {
guard pureNumberIndex < pureNumber.endIndex else { return result }
result.append(pureNumber[pureNumberIndex])
pureNumber.formIndex(after: &pureNumberIndex)
} else {
result.append(patternCharacter)
}
}
return result
}
}
Usage in the editingChanged event handler:
let pattern = "+# (###) ###-####"
let mobile = textField.text.substring(to: pattern.count-1)
textField.text = mobile.applyPatternOnNumbers(pattern: pattern)
// print(textField.text) +1 (800) 666-8888
The problem is that the space, -, ( and ) characters cannot be removed with the delete key.

The regex you are using matches every character that is not a digit, so everything except the digits is stripped:
[^۰-۹0-9]
I'm not sure, but you may change it to:
[^۰-۹0-9\s\-\(\)]
and it may work. You can add a \ before your special characters inside [], and you can add any other characters that you do not want to be replaced.
Or you may simplify it to
[^\d\s\-\(\)]
and it might work.
Method 2
You may use this regex, which is an exact match for your phone number format:
\+\d+\s\(\d{3}\)\s\d{3}-\d{4}
You may drop the leading \+ if the plus sign is not needed:
\d+\s\(\d{3}\)\s\d{3}-\d{4}
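A minimal sketch of how the Method 2 pattern could be used to check whether a string is already in the target format. The function name isFormattedPhoneNumber and the ^ and $ anchors are assumptions added here so that the whole string has to match; they are not part of the answer.

```swift
import Foundation

// Hypothetical helper: true when the whole string looks like "+1 (800) 666-8888".
func isFormattedPhoneNumber(_ text: String) -> Bool {
    // Method 2 pattern, anchored at both ends (the anchors are an addition).
    let pattern = "^\\+\\d+\\s\\(\\d{3}\\)\\s\\d{3}-\\d{4}$"
    return text.range(of: pattern, options: .regularExpression) != nil
}

let complete = isFormattedPhoneNumber("+1 (800) 666-8888")
let partial = isFormattedPhoneNumber("+1 (800) 666-")
```

A check like this only says whether the text is complete; it does not by itself change how the delete key behaves.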

How to split string as English and non English using Swift 4?

I have a string which contains English and Arabic together. I am using an API, that is why I cannot set an indicator in it.
What I want to get is: the Arabic and English split into two parts. Here is a sample String:
"بِاسْمِكَ رَبِّي وَضَعْتُ جَنْبِي، وَبِكَ أَرْفَعُهُ، فَإِنْ أَمْسَكْتَ نَفْسِي فَارْحَمْهَا، وَإِنْ أَرْسَلْتَهَا فَاحْفَظْهَا، بِمَا تَحْفَظُ بِهِ عِبَادَكَ الصَّالِحِينَ.Bismika rabbee wadaAAtu janbee wabika arfaAAuh, fa-in amsakta nafsee farhamha, wa-in arsaltaha fahfathha bima tahfathu bihi AAibadakas-saliheen. In Your name my Lord, I lie down and in Your name I rise, so if You should take my soul then have mercy upon it, and if You should return my soul then protect it in the manner You do so with Your righteous servants.",
I cannot find how to split it so that I get the Arabic and the English as two separate parts.
The string can contain any language; my problem is only to take out the English or the Arabic text and show each in its respective field.
How can I achieve it?

You can use a Natural Language Tagger, which would work even if both scripts are intermingled:
import NaturalLanguage
let str = "¿como? بداية start وسط middle начать средний конец نهاية end. 從中間開始. "
let tagger = NLTagger(tagSchemes: [.script])
tagger.string = str
var index = str.startIndex
var dictionary = [String: String]()
var lastScript = "other"
while index < str.endIndex {
let res = tagger.tag(at: index, unit: .word, scheme: .script)
let range = res.1
let script = res.0?.rawValue
switch script {
case .some(let s):
lastScript = s
dictionary[s, default: ""] += dictionary["other", default: ""] + str[range]
dictionary.removeValue(forKey: "other")
default:
dictionary[lastScript, default: ""] += str[range]
}
index = range.upperBound
}
print(dictionary)
and print the result if you'd like:
for entry in dictionary {
print(entry.key, ":", entry.value)
}
yielding :
Hant : 從中間開始.
Cyrl : начать средний конец
Arab : بداية وسط نهاية
Latn : ¿como? start middle end.
This is still not perfect, since the language tagger only checks which script most of the letters in a word belong to. For example, in the string you're working with, the tagger would consider الصَّالِحِينَ.Bismika as one word. To overcome this, we could use two pointers to traverse the original string and check the script of each word individually. Words are defined as runs of contiguous letters:
let str = "بِاسْمِكَ رَبِّي وَضَعْتُ جَنْبِي، وَبِكَ أَرْفَعُهُ، فَإِنْ أَمْسَكْتَ نَفْسِي فَارْحَمْهَا، وَإِنْ أَرْسَلْتَهَا فَاحْفَظْهَا، بِمَا تَحْفَظُ بِهِ عِبَادَكَ الصَّالِحِينَ.Bismika rabbee wadaAAtu janbee wabika arfaAAuh, fa-in amsakta nafsee farhamha, wa-in arsaltaha fahfathha bima tahfathu bihi AAibadakas-saliheen. In Your name my Lord, I lie down and in Your name I rise, so if You should take my soul then have mercy upon it, and if You should return my soul then protect it in the manner You do so with Your righteous servants."
let tagger = NLTagger(tagSchemes: [.script])
var i = str.startIndex
var dictionary = [String: String]()
var lastScript = "glyphs"
while i < str.endIndex {
var j = i
while j < str.endIndex,
CharacterSet.letters.inverted.isSuperset(of: CharacterSet(charactersIn: String(str[j]))) {
j = str.index(after: j)
}
if i != j { dictionary[lastScript, default: ""] += str[i..<j] }
if j < str.endIndex { i = j } else { break }
while j < str.endIndex,
CharacterSet.letters.isSuperset(of: CharacterSet(charactersIn: String(str[j]))) {
j = str.index(after: j)
}
let tempo = String(str[i..<j])
tagger.string = tempo
let res = tagger.tag(at: tempo.startIndex, unit: .word, scheme: .script)
if let s = res.0?.rawValue {
lastScript = s
dictionary[s, default: ""] += dictionary["glyphs", default: ""] + tempo
dictionary.removeValue(forKey: "glyphs")
}
else { dictionary["other", default: ""] += tempo }
i = j
}

You can use NLTagger, as answered by @ielyamani, but its only limitation is that it requires iOS 12 or later.
If you are trying to do this on earlier iOS versions, you can take a look at NSCharacterSet.
You can create your own character set to check whether a string consists of English letters and digits:
extension String {
    // Returns true only when every character is an ASCII letter or digit.
    func containsLatinCharacters() -> Bool {
        let charSet = CharacterSet(charactersIn: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890").inverted
        // Any character outside the allowed set means the string is not purely Latin.
        return rangeOfCharacter(from: charSet) == nil
    }
}
Another option is to use the character sets already available:
let nonLatinString = string.trimmingCharacters(in: .alphanumerics)//strips alphanumerics from both ends; symbols will still get through
let latinString = string.trimmingCharacters(in: CharacterSet.alphanumerics.inverted)//strips symbols and other non-alphanumerics from both ends
Note that trimmingCharacters(in:) only removes characters from the two ends of the string, and .alphanumerics covers letters and digits of every script, not only Latin. If these are not good enough, you can create your own character set and use union, intersection, etc. to filter out the wanted and the unwanted characters.
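As a rough sketch of the custom-character-set idea that also runs on older iOS versions, the string can be walked character by character and each character assigned by its first Unicode scalar. The function name splitArabic(from:) is hypothetical, and the sketch assumes that "Arabic" means the basic Arabic block U+0600 to U+06FF only; spaces and punctuation stay with whichever script came last.

```swift
// Hypothetical helper: separates Arabic-block text from everything else.
func splitArabic(from text: String) -> (arabic: String, other: String) {
    var arabic = ""
    var other = ""
    var inArabic = false
    for character in text {
        guard let first = character.unicodeScalars.first else { continue }
        if (0x0600...0x06FF).contains(first.value) {
            // Assumption: only the basic Arabic block counts as Arabic.
            inArabic = true
        } else if first.properties.isAlphabetic {
            // A letter from any other script switches to the "other" bucket.
            inArabic = false
        }
        // Neutral characters (spaces, punctuation) follow the current script.
        if inArabic { arabic.append(character) } else { other.append(character) }
    }
    return (arabic, other)
}

let parts = splitArabic(from: "الصَّالِحِينَ.Bismika rabbee")
```

Because a Swift Character is a whole grapheme cluster, the Arabic diacritics stay attached to their base letters and are classified together with them.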

Step 1:
Split the whole string into an array on ".", since the sentences are separated by ".".
Step 2:
Pass each sentence to a language detector and append it to the string for its language.
Final Code
//add in your viewController
enum Language : String {
case arabic = "ar"
case english = "en"
}
override func viewDidLoad() {
super.viewDidLoad()
//make array of string
let kalmaArray = "بِاسْمِكَ رَبِّي وَضَعْتُ جَنْبِي، وَبِكَ أَرْفَعُهُ، فَإِنْ أَمْسَكْتَ نَفْسِي فَارْحَمْهَا، وَإِنْ أَرْسَلْتَهَا فَاحْفَظْهَا، بِمَا تَحْفَظُ بِهِ عِبَادَكَ الصَّالِحِينَ.Bismika rabbee wadaAAtu janbee wabika arfaAAuh, fa-in amsakta nafsee farhamha, wa-in arsaltaha fahfathha bima tahfathu bihi AAibadakas-saliheen. In Your name my Lord, I lie down and in Your name I rise, so if You should take my soul then have mercy upon it, and if You should return my soul then protect it in the manner You do so with Your righteous servants.".components(separatedBy: ".")
splitInLanguages(kalmaArray: kalmaArray)
}
private func splitInLanguages(kalmaArray: [String]){
var englishText = ""
var arabicText = ""
for kalma in kalmaArray {
if kalma.count > 0 {
if let language = NSLinguisticTagger.dominantLanguage(for: kalma) {
switch language {
case Language.arabic.rawValue:
arabicText.append(kalma)
arabicText.append(".")
break
default: // English
englishText.append(kalma)
englishText.append(".")
break
}
} else {
print("Unknown language")
}
}
}
debugPrint("Arabic: ", arabicText)
debugPrint("English: ", englishText)
}
I hope this helps you split the string into its two languages. Let me know if you still have any issues.

Remove special characters from the string

I am trying to use an iOS app to dial a number. The problem is that the number is in the following format:
po placeAnnotation.mapItem.phoneNumber!
"‎+1 (832) 831-6486"
I want to get rid of some special characters and I want the following:
832-831-6486
I used the following code but it did not remove anything:
let charactersToRemove = CharacterSet(charactersIn: "()+-")
var telephone = placeAnnotation.mapItem.phoneNumber?.trimmingCharacters(in: charactersToRemove)
Any ideas?

placeAnnotation.mapItem.phoneNumber!.components(separatedBy: CharacterSet.decimalDigits.inverted)
.joined()
Here you go!
I tested it and it works well.
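The line above keeps every digit, including the country code. A minimal sketch of getting from there to the 832-831-6486 form the question asks for could look like this; the function name dialableNumber(from:) is hypothetical, and it assumes a ten-digit national number with any country code in front of it.

```swift
// Hypothetical helper: keeps ASCII digits and formats the last ten as ###-###-####.
func dialableNumber(from raw: String) -> String? {
    // Drop everything that is not an ASCII digit (spaces, brackets, "+", invisible marks).
    let digits = raw.filter { $0.isASCII && $0.isNumber }
    // Assumption: the national number is the last ten digits.
    guard digits.count >= 10 else { return nil }
    let local = Array(digits.suffix(10))
    return "\(String(local[0..<3]))-\(String(local[3..<6]))-\(String(local[6..<10]))"
}

let number = dialableNumber(from: "+1 (832) 831-6486")
```

Filtering on ASCII digits also discards invisible characters such as a left-to-right mark at the start of the string.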

If you want something similar to CharacterSet with some flexibility, this should work:
let phoneNumber = "1 (832) 831-6486"
let charsToRemove: Set<Character> = Set("()+-")
let newNumberCharacters = phoneNumber.filter { !charsToRemove.contains($0) }
print(newNumberCharacters) //prints 1 832 8316486

I know the question is already answered, but to format phone numbers in any way one could use a custom formatter like below
class PhoneNumberFormatter: Formatter
{
    var numberFormat: String = "(###) ### ####"
    override func string(for obj: Any?) -> String? {
        guard let number = obj as? NSNumber else { return nil }
        var input = number.int64Value
        var output = numberFormat
        // Replace the hashes from right to left with the digits of the number.
        while let range = output.range(of: "#", options: .backwards)
        {
            output = output.replacingCharacters(in: range, with: "\(input % 10)")
            input /= 10
        }
        return output
    }
    func string(from number: NSNumber) -> String?
    {
        return string(for: number)
    }
}
let phoneNumberFormatter = PhoneNumberFormatter()
//Digits are filled in backwards in place of the hashes. It is easy to change the custom format in any way
phoneNumberFormatter.numberFormat = "###-##-##-##-##"
phoneNumberFormatter.string(from: 18063783889)

Swift 4 and later
func removeSpecialCharsFromString(_ str: String) -> String {
    struct Constants {
        static let validChars: Set<Character> = Set("1234567890-")
    }
    return String(str.filter { Constants.validChars.contains($0) })
}
To Use
let str : String = "+1 (832) 831-6486"
let newStr : String = self.removeSpecialCharsFromString(str)
print(newStr)
Note: add to validChars any characters that you want to keep in the resulting string.

If you have digits and special characters in a String, then use the following code to remove the special characters:
let numberWithSpecialChar = "1800-180-0000"
let actulNumber = numberWithSpecialChar.components(separatedBy: CharacterSet.decimalDigits.inverted).joined()
Otherwise, if you have letters and special characters in a String, then use the following code to remove the special characters:
let charactersWithSpecialChar = "A man, a plan, a cat, a ham, a yak, a yam, a hat, a canal-Panama!"
let actulString = charactersWithSpecialChar.components(separatedBy: CharacterSet.letters.inverted).joined(separator: " ")

NSString *str = @"(123)-456-7890";
NSLog(@"String: %@", str);
// Create character set with specified characters
NSMutableCharacterSet *characterSet =
[NSMutableCharacterSet characterSetWithCharactersInString:@"()-"];
// Build array of components using specified characters as separators
NSArray *arrayOfComponents = [str componentsSeparatedByCharactersInSet:characterSet];
// Create string from the array components
NSString *strOutput = [arrayOfComponents componentsJoinedByString:@""];
NSLog(@"New string: %@", strOutput);

Replace part of string with lower case letters - Swift

I have a Swift-based iOS app and one of the features allows you to comment on a post. Users can add "@mentions" in their posts to tag other people. However, I want to stop the user from adding a username with a capital letter.
Is there any way I can convert a string so that the @usernames are all in lowercase?
For example:
I really enjoy sightseeing with @uSerABC (not allowed)
I really enjoy sightseeing with @userabc (allowed)
I know there is a string property in Swift called .lowercaseString, but the problem with that is that it makes the entire string lowercase, and that's not what I want. I only want the @username to be in lower case.
Is there any way around this without having to use the .lowercaseString property?
Thanks for your time, Dan.

This comes from code I use to detect hashtags, which I've modified to detect mentions:
func detectMentionsInText(text: String) -> [NSRange]? {
    let mentionsDetector = try? NSRegularExpression(pattern: "@(\\w+)", options: .caseInsensitive)
    let results = mentionsDetector?.matches(in: text, options: .withoutAnchoringBounds, range: NSRange(location: 0, length: text.utf16.count))
    return results?.map { $0.range(at: 0) }
}
It uses a regex to detect all the mentions in a string and returns an array of NSRange values. Each range gives you the beginning and the end of a "mention", so you can easily replace it with a lower case version.
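A minimal sketch of the replacement step described above, assuming mentions start with @ and consist of word characters. The function name lowercasingMentions(in:) is hypothetical; the matches are replaced from last to first so that earlier ranges stay valid even if lowercasing changes the length of a mention.

```swift
import Foundation

// Hypothetical helper: lowercases every @mention and leaves the rest of the text alone.
func lowercasingMentions(in text: String) -> String {
    guard let regex = try? NSRegularExpression(pattern: "@\\w+") else { return text }
    let result = NSMutableString(string: text)
    let fullRange = NSRange(location: 0, length: result.length)
    // Work backwards so that replacing one match cannot shift the ranges still to be processed.
    for match in regex.matches(in: text, options: [], range: fullRange).reversed() {
        let mention = result.substring(with: match.range)
        result.replaceCharacters(in: match.range, with: mention.lowercased())
    }
    return result as String
}

let corrected = lowercasingMentions(in: "I really enjoy sightseeing with @uSerABC")
```

Only the matched ranges are touched, so capital letters elsewhere in the comment are preserved.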

Split the string into two using the following command -
let arr = myString.components(separatedBy: "@")
//Convert arr[1] to lower case
//Append to arr[0]
//Enjoy

Thanks to everyone for their help. In the end I couldn't get any of the solutions to work and after a lot of testing, I came up with this solution:
func correctStringWithUsernames(inputString: String, completion: (correctString: String) -> Void) {
// Create the final string and get all
// the separate strings from the data.
var finalString: String!
var commentSegments: NSArray!
commentSegments = inputString.componentsSeparatedByString(" ")
if (commentSegments.count > 0) {
for (var loop = 0; loop < commentSegments.count; loop++) {
// Check the username to ensure that there
// are no capital letters in the string.
let currentString = commentSegments[loop] as! String
let capitalLetterRegEx = ".*[A-Z]+.*"
let textData = NSPredicate(format:"SELF MATCHES %@", capitalLetterRegEx)
let capitalResult = textData.evaluateWithObject(currentString)
// Check if the current loop string
// is a @user mention string or not.
if (currentString.containsString("@")) {
// If we are in the first loop then set the
// string otherwise concatenate the string.
if (loop == 0) {
if (capitalResult == true) {
// The username contains capital letters
// so change it to a lower case version.
finalString = currentString.lowercaseString
}
else {
// The username does not contain capital letters.
finalString = currentString
}
}
else {
if (capitalResult == true) {
// The username contains capital letters
// so change it to a lower case version.
finalString = "\(finalString) \(currentString.lowercaseString)"
}
else {
// The username does not contain capital letters.
finalString = "\(finalString) \(currentString)"
}
}
}
else {
// The current string is NOT a @user mention
// so simply set or concatenate the finalString.
if (loop == 0) {
finalString = currentString
}
else {
finalString = "\(finalString) \(currentString)"
}
}
}
}
else {
// No issues pass back the string.
finalString = inputString
}
// Pass back the correct username string.
completion(correctString: finalString)
}
It's certainly not the most elegant or efficient solution around, but it does work. If there are any ways of improving it, please leave a comment.

Using character delimiters to find and highlight text in Swift

I previously developed an android app that served as a reference guide to users. It used a sqlite database to store the information. The database stores UTF-8 text without formatting (i.e. bold or underlined)
To highlight what sections of text required formatting I enclosed them using delimiter tokens specifically $$ as this does not appear in the database as information. Before displaying the text to the user I wrote a method to find these delimiters and add formatting to the text contained within them and delete the delimiters. so $$foo$$ became foo.
My java code for this is as follows:
private static CharSequence boldUnderlineText(CharSequence text, String token) {
int tokenLen = token.length();
int start = text.toString().indexOf(token) + tokenLen;
int end = text.toString().indexOf(token, start);
while (start > -1 && end > -1)
{
SpannableStringBuilder spannableStringBuilder = new SpannableStringBuilder(text);
//add the formatting required
spannableStringBuilder.setSpan(new UnderlineSpan(), start, end, 0);
spannableStringBuilder.setSpan(new StyleSpan(Typeface.BOLD), start, end, 0);
// Delete the tokens before and after the span
spannableStringBuilder.delete(end, end + tokenLen);
spannableStringBuilder.delete(start - tokenLen, start);
text = spannableStringBuilder;
start = text.toString().indexOf(token, end - tokenLen - tokenLen) + tokenLen;
end = text.toString().indexOf(token, start);
}
return text;
}
I have recreated my app in Swift for iOS and it is complete apart from showing the correct formatting. It appears that Swift treats strings differently from other languages.
So far I have tried using both NSString and String types for my original unformatted paragraph and have managed to get the range, start and end index of the first delimiter:
func applyFormatting2(noFormatString: NSString, delimiter: String){
let paragraphLength: Int = noFormatString.length //length of paragraph
let tokenLength: Int = delimiter.characters.count //length of token
let rangeOfToken = noFormatString.rangeOfString(delimiter) //range of the first delimiter
let startOfToken = rangeOfToken.toRange()?.startIndex //start index of first delimiter
let endOfToken = rangeOfToken.toRange()?.endIndex //end index of first delimiter
var startOfFormatting = endOfToken //where to start the edit (end index of first delimiter)
}
OR
func applyFormatting(noFormatString: String, token: String){
let paragraphLength: Int = noFormatString.characters.count
let tokenLength: Int = token.characters.count //length of the $$ Token (2)
let rangeOfToken = noFormatString.rangeOfString(token) //The range of the first instance of $$ in the no format string
let startOfToken = rangeOfToken?.startIndex //the starting index of the found range for the found instance of $$
let endOfToken = rangeOfToken?.endIndex //the end index of the found range for the found instance of $$
var startOfFormatting = endOfToken
}
I appreciate this code is verbose and has pointless variables, but it helps me think through my code when I'm working out a problem.
I am currently struggling to work out how to find the second/closing delimiter. I want to search through the string from a specific index, as I did in Java using the line
int end = text.toString().indexOf(token, start);
however I cannot work out how to do this using ranges.
Can anyone help me out with either how to correctly identify where the closing delimiter is or how to complete the code block to format all the required text?
Thanks
Aldo

How about using NSRegularExpression?
public extension NSMutableAttributedString {
    func addAttributes(_ attrs: [NSAttributedString.Key: Any], delimiter: String) throws {
        let escaped = NSRegularExpression.escapedPattern(for: delimiter)
        let regex = try NSRegularExpression(pattern: "\(escaped)(.*?)\(escaped)", options: [])
        // Match against a snapshot of the text, because the receiver is mutated inside the loop.
        let original = string
        let matches = regex.matches(in: original, options: [], range: NSRange(location: 0, length: (original as NSString).length))
        var offset = 0
        for result in matches {
            // Shift the match by the number of delimiter characters already removed.
            let range = NSRange(location: result.range.location + offset, length: result.range.length)
            addAttributes(attrs, range: range)
            let replacement = regex.replacementString(for: result, in: original, offset: 0, template: "$1")
            replaceCharacters(in: range, with: replacement)
            offset -= 2 * (delimiter as NSString).length
        }
    }
}
Here is how you call it.
let string = NSMutableAttributedString(string: "Here is some $$bold$$ text that should be $$emphasized$$")
let attributes: [NSAttributedString.Key: Any] = [.font: UIFont.boldSystemFont(ofSize: 15)]
try! string.addAttributes(attributes, delimiter: "$$")

The iOS way of doing this is with NS[Mutable]AttributedStrings. You set dictionaries of attributes on text ranges. These attributes include font weights, sizes, colors, line spacing, etc.
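For the specific step the question asks about, searching for the closing delimiter from a given index as Java's indexOf(token, start) does, a minimal sketch could use the range parameter of range(of:). The function name rangesBetweenDelimiters(in:delimiter:) is hypothetical; it only collects the ranges of the enclosed text and leaves applying the attributes to the caller.

```swift
import Foundation

// Hypothetical helper: returns the range of the text enclosed by each delimiter pair.
func rangesBetweenDelimiters(in text: String, delimiter: String) -> [Range<String.Index>] {
    var ranges: [Range<String.Index>] = []
    var searchStart = text.startIndex
    // Find an opening delimiter, then search for the closing one starting after it.
    while let open = text.range(of: delimiter, range: searchStart..<text.endIndex),
          let close = text.range(of: delimiter, range: open.upperBound..<text.endIndex) {
        ranges.append(open.upperBound..<close.lowerBound)
        // Continue after the closing delimiter.
        searchStart = close.upperBound
    }
    return ranges
}

let sample = "Here is some $$bold$$ text that should be $$emphasized$$"
let enclosed = rangesBetweenDelimiters(in: sample, delimiter: "$$").map { String(sample[$0]) }
```

Restricting each search to a sub-range plays the role of the start argument in the Java call; an unmatched opening delimiter simply ends the loop.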
