How to split string as English and non English using Swift 4? - ios

I have a string that contains English and Arabic together. It comes from an API, so I cannot add a marker to it.
What I want is to split the Arabic and the English into two parts. Here is a sample string:
"بِاسْمِكَ رَبِّي وَضَعْتُ جَنْبِي، وَبِكَ أَرْفَعُهُ، فَإِنْ أَمْسَكْتَ نَفْسِي فَارْحَمْهَا، وَإِنْ أَرْسَلْتَهَا فَاحْفَظْهَا، بِمَا تَحْفَظُ بِهِ عِبَادَكَ الصَّالِحِينَ.Bismika rabbee wadaAAtu janbee wabika arfaAAuh, fa-in amsakta nafsee farhamha, wa-in arsaltaha fahfathha bima tahfathu bihi AAibadakas-saliheen. In Your name my Lord, I lie down and in Your name I rise, so if You should take my soul then have mercy upon it, and if You should return my soul then protect it in the manner You do so with Your righteous servants.",
I cannot find a way to split it so that the Arabic and the English end up in two separate strings.
The text may contain other languages as well; I only need to extract the English and the Arabic and show each in its own field.
How can I achieve this?

You can use a Natural Language Tagger, which would work even if both scripts are intermingled:
import NaturalLanguage
let str = "¿como? بداية start وسط middle начать средний конец نهاية end. 從中間開始. "
let tagger = NLTagger(tagSchemes: [.script])
tagger.string = str
var index = str.startIndex
var dictionary = [String: String]()
var lastScript = "other"
while index < str.endIndex {
    let res = tagger.tag(at: index, unit: .word, scheme: .script)
    let range = res.1
    let script = res.0?.rawValue
    switch script {
    case .some(let s):
        lastScript = s
        dictionary[s, default: ""] += dictionary["other", default: ""] + str[range]
        dictionary.removeValue(forKey: "other")
    default:
        dictionary[lastScript, default: ""] += str[range]
    }
    index = range.upperBound
}
print(dictionary)
and print the result if you'd like:
for entry in dictionary {
print(entry.key, ":", entry.value)
}
yielding :
Hant : 從中間開始.
Cyrl : начать средний конец
Arab : بداية وسط نهاية
Latn : ¿como? start middle end.
This is still not perfect, since the tagger assigns each word the script that most of its letters belong to. For example, in the string you're working with, the tagger would consider الصَّالِحِينَ.Bismika one word. To overcome this, we can use two pointers to traverse the original string and check the script of each word individually, where a word is defined as a run of contiguous letters:
let str = "بِاسْمِكَ رَبِّي وَضَعْتُ جَنْبِي، وَبِكَ أَرْفَعُهُ، فَإِنْ أَمْسَكْتَ نَفْسِي فَارْحَمْهَا، وَإِنْ أَرْسَلْتَهَا فَاحْفَظْهَا، بِمَا تَحْفَظُ بِهِ عِبَادَكَ الصَّالِحِينَ.Bismika rabbee wadaAAtu janbee wabika arfaAAuh, fa-in amsakta nafsee farhamha, wa-in arsaltaha fahfathha bima tahfathu bihi AAibadakas-saliheen. In Your name my Lord, I lie down and in Your name I rise, so if You should take my soul then have mercy upon it, and if You should return my soul then protect it in the manner You do so with Your righteous servants."
let tagger = NLTagger(tagSchemes: [.script])
var i = str.startIndex
var dictionary = [String: String]()
var lastScript = "glyphs"
while i < str.endIndex {
    var j = i
    // Skip over non-letters (spaces, punctuation, digits)
    while j < str.endIndex,
          CharacterSet.letters.inverted.isSuperset(of: CharacterSet(charactersIn: String(str[j]))) {
        j = str.index(after: j)
    }
    // Attach the skipped run to the script seen last
    if i != j { dictionary[lastScript, default: ""] += str[i..<j] }
    if j < str.endIndex { i = j } else { break }
    // Collect one word: a run of contiguous letters
    while j < str.endIndex,
          CharacterSet.letters.isSuperset(of: CharacterSet(charactersIn: String(str[j]))) {
        j = str.index(after: j)
    }
    // Tag the word on its own
    let tempo = String(str[i..<j])
    tagger.string = tempo
    let res = tagger.tag(at: tempo.startIndex, unit: .word, scheme: .script)
    if let s = res.0?.rawValue {
        lastScript = s
        dictionary[s, default: ""] += dictionary["glyphs", default: ""] + tempo
        dictionary.removeValue(forKey: "glyphs")
    }
    else { dictionary["other", default: ""] += tempo }
    i = j
}

You can use the NaturalLanguage tagger as answered by @ielyamani, but its limitation is that it requires iOS 12+.
If you need to support earlier iOS versions, take a look at NSCharacterSet.
You can create your own character set to check whether a string consists only of English letters and digits:
import Foundation
extension String {
    // Returns true only if every character is an ASCII letter or digit
    func containsLatinCharacters() -> Bool {
        let allowed = CharacterSet(charactersIn: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890")
        // Any character outside the allowed set means the string is not purely Latin
        return rangeOfCharacter(from: allowed.inverted) == nil
    }
}
Another option is to use the character sets already available. Note that trimmingCharacters(in:) only strips characters from both ends of the string, and that .alphanumerics covers the letters and digits of every script (Arabic included), not just Latin:
let withoutEdgeAlphanumerics = string.trimmingCharacters(in: .alphanumerics) // strips letters and digits from both ends
let withoutEdgeSymbols = string.trimmingCharacters(in: CharacterSet.alphanumerics.inverted) // strips symbols and whitespace from both ends
If these are not good enough, you can create your own character set, using union, intersection, etc., to filter out the unwanted characters.
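The custom character-set idea could be sketched roughly as follows. This is only an illustration under stated assumptions: "Arabic" is taken to mean the basic Arabic block (U+0600 to U+06FF), "English" is taken to mean ASCII letters, and `splitByScript` is a hypothetical helper name, not part of any framework. Spaces and punctuation are attached to whichever script was seen last.

```swift
import Foundation

// Assumption: basic Arabic block only, and ASCII letters only.
let arabicRange: ClosedRange<Unicode.Scalar> = "\u{0600}"..."\u{06FF}"
let lowerRange: ClosedRange<Unicode.Scalar> = "a"..."z"
let upperRange: ClosedRange<Unicode.Scalar> = "A"..."Z"
let arabicLetters = CharacterSet(charactersIn: arabicRange)
let latinLetters = CharacterSet(charactersIn: lowerRange)
    .union(CharacterSet(charactersIn: upperRange))

// Hypothetical helper: walks the string once and appends every character
// to the bucket of the most recently seen script.
func splitByScript(_ text: String) -> (arabic: String, latin: String) {
    var arabic = ""
    var latin = ""
    var inArabic = true  // leading neutral characters go to the Arabic bucket
    for character in text {
        // A Character may hold several scalars (letter plus diacritics);
        // the first scalar is the base letter.
        if let scalar = character.unicodeScalars.first {
            if arabicLetters.contains(scalar) {
                inArabic = true
            } else if latinLetters.contains(scalar) {
                inArabic = false
            }
        }
        if inArabic {
            arabic.append(character)
        } else {
            latin.append(character)
        }
    }
    return (arabic, latin)
}

let parts = splitByScript("مرحبا Hello")
print(parts.arabic)
print(parts.latin)
```

Because this works on plain Unicode ranges, it has no iOS 12 requirement, but it knows nothing about languages: any other script would need its own set.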

Step 1:
Split the whole string into an array on ".", since the sentences are separated by ".".
Step 2:
Pass each sentence to the language detector and append it to the matching string.
Final code:
//add in your viewController
enum Language : String {
case arabic = "ar"
case english = "en"
}
override func viewDidLoad() {
super.viewDidLoad()
//make array of string
let kalmaArray = "بِاسْمِكَ رَبِّي وَضَعْتُ جَنْبِي، وَبِكَ أَرْفَعُهُ، فَإِنْ أَمْسَكْتَ نَفْسِي فَارْحَمْهَا، وَإِنْ أَرْسَلْتَهَا فَاحْفَظْهَا، بِمَا تَحْفَظُ بِهِ عِبَادَكَ الصَّالِحِينَ.Bismika rabbee wadaAAtu janbee wabika arfaAAuh, fa-in amsakta nafsee farhamha, wa-in arsaltaha fahfathha bima tahfathu bihi AAibadakas-saliheen. In Your name my Lord, I lie down and in Your name I rise, so if You should take my soul then have mercy upon it, and if You should return my soul then protect it in the manner You do so with Your righteous servants.".components(separatedBy: ".")
splitInLanguages(kalmaArray: kalmaArray)
}
private func splitInLanguages(kalmaArray: [String]) {
    var englishText = ""
    var arabicText = ""
    for kalma in kalmaArray {
        if kalma.count > 0 {
            if let language = NSLinguisticTagger.dominantLanguage(for: kalma) {
                switch language {
                case Language.arabic.rawValue:
                    arabicText.append(kalma)
                    arabicText.append(".")
                default: // English
                    englishText.append(kalma)
                    englishText.append(".")
                }
            } else {
                print("Unknown language")
            }
        }
    }
    debugPrint("Arabic: ", arabicText)
    debugPrint("English: ", englishText)
}
I hope this helps you split the string into the two languages. Let me know if you still have any issues.

Related

How to Check if String begins with Alphabet Letter in Swift 5?

Problem: I am trying to sort a List in SwiftUI according to each item's first character. I would also like a section for all items that do not begin with a letter of the alphabet (numbers, special characters).
My code so far:
let nonAlphabetItems = items.filter { $0.name.uppercased() != /* begins with A - Z */ }
Does anyone have a solution for this? Of course I could write a huge loop construct, but I hope there is a more elegant way.
Thanks for your help.
You can check if a string range "A"..."Z" contains the first letter of your name property:
struct Item {
let name: String
}
let items: [Item] = [.init(name: "Def"),.init(name: "Ghi"),.init(name: "123"),.init(name: "Abc")]
let nonAlphabetItems = items.filter { !("A"..."Z" ~= ($0.name.first?.uppercased() ?? "#")) }
nonAlphabetItems // [{name "123"}]
Expanding on this topic, we can extend Character to add an isAsciiLetter property:
extension Character {
var isAsciiLetter: Bool { "A"..."Z" ~= self || "a"..."z" ~= self }
}
This allows us to extend StringProtocol to check if a string starts with an ASCII letter:
extension StringProtocol {
var startsWithAsciiLetter: Bool { first?.isAsciiLetter == true }
}
And just a helper to negate a boolean property:
extension Bool {
var negated: Bool { !self }
}
Now we can filter the items collection as follows:
let nonAlphabetItems = items.filter(\.name.startsWithAsciiLetter.negated) // [{name "123"}]
If you need an occasional filter, you could simply write a condition combining standard predicates isLetter and isASCII which are already defined for Character. It's as simple as:
let items = [ "Abc", "01bc", "Ça va", "", " ", "𓀫𓀫𓀫𓀫"]
let nonAlphabetItems = items.filter { $0.isEmpty || !$0.first!.isASCII || !$0.first!.isLetter }
print (nonAlphabetItems) // -> Output: ["01bc", "Ça va", "", " ", "𓀫𓀫𓀫𓀫"]
If the string is not empty, it is guaranteed to have a first character, so $0.first! is safe. It is tempting to use isLetter alone, but it is true for characters in many alphabets, including for example ancient Egyptian hieroglyphs like "𓀫" or the French "Ç" and accented characters. This is why you need to restrict it to ASCII letters, to limit yourself to the basic Latin alphabet.
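Building on the isASCII and isLetter combination, the sectioning the question asks for could be sketched as below. The sample names and the `sectionKey(for:)` helper are hypothetical; in the question the input would be the items' name property. Everything that does not start with an ASCII letter is collected under "#".

```swift
// Hypothetical sample data standing in for the items' names.
let names = ["Def", "ghi", "123", "Abc", "Ça va", ""]

// Section key: the uppercased first character when it is an ASCII letter,
// otherwise "#" (digits, symbols, non-ASCII letters, empty strings).
func sectionKey(for name: String) -> String {
    guard let first = name.first, first.isASCII, first.isLetter else {
        return "#"
    }
    return first.uppercased()
}

// Group the names by key; order inside each group follows the input order.
let sections = Dictionary(grouping: names, by: sectionKey(for:))
let sortedKeys = sections.keys.sorted()

for key in sortedKeys {
    print(key, sections[key] ?? [])
}
```

The sorted keys can then drive one Section per key in a SwiftUI List.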
You can use CharacterSet in the following way:
import Foundation
let phrase = "Test case"
let characterSet = CharacterSet.letters
let range = phrase.rangeOfCharacter(from: characterSet)
// range will be nil if no letter is found
if range != nil {
    print("letters found")
} else {
    print("letters not found")
}
You can work with the ASCII value:
extension String {
    var firstCharacterIsAlphabet: Bool {
        guard let firstChar = self.first else { return false }
        let unicode = String(firstChar).unicodeScalars
        let ascii = Int(unicode[unicode.startIndex].value)
        // 65...90 is A-Z, 97...122 is a-z
        return (ascii >= 65 && ascii <= 90) || (ascii >= 97 && ascii <= 122)
    }
}
var isAlphabet = "Hello".firstCharacterIsAlphabet
The Character type has a property for this:
let x: Character = "x"
x.isLetter // true for letters, false for punctuation, numbers, whitespace, ...
Note that this will include characters from other alphabets (Greek, Cyrillic, Chinese, ...).
As String is a Sequence with Element equal to Character, we can use the .first property to get the first char.
With this, you can filter your items:
let filtered = items.filter { $0.name.first?.isLetter ?? false }
You can get this done through this simple String extension
extension StringProtocol {
var isFirstCharacterAlp: Bool {
first?.isASCII == true && first?.isLetter == true
}
}
Usage:
print ("H1".isFirstCharacterAlp)
print ("ابراهيم1".isFirstCharacterAlp)
Output
true
false
Happy Coding!

backspace not work in outside of regex in swift

I use this method to apply a phone number pattern to a UITextField on the .editingChanged event, but the delete key only removes the digits:
extension String {
    func applyPatternOnNumbers(pattern: String) -> String {
        let replacementCharacter: Character = "#"
        // Keep only Persian and Western digits
        let pureNumber = self.replacingOccurrences(of: "[^۰-۹0-9]", with: "", options: .regularExpression)
        var result = ""
        var pureNumberIndex = pureNumber.startIndex
        for patternCharacter in pattern {
            if patternCharacter == replacementCharacter {
                // Each "#" in the pattern consumes the next digit
                guard pureNumberIndex < pureNumber.endIndex else { return result }
                result.append(pureNumber[pureNumberIndex])
                pureNumber.formIndex(after: &pureNumberIndex)
            } else {
                result.append(patternCharacter)
            }
        }
        return result
    }
}
Used in the editingChanged handler:
let pattern = "+# (###) ###-####"
let mobile = textField.text.substring(to: pattern.count-1)
textfield.text = mobile.applyPatternOnNumbers(pattern: pattern)
// print(textfield.text) +1 (800) 666-8888
The problem is that the space, -, ( and ) characters cannot be deleted.
The regex you are using removes everything except digits:
[^۰-۹0-9]
I'm not sure, but you could change it to:
[^۰-۹0-9\s\(\)\-]
and it may work. Add a \ before each special character inside [] (escaping the hyphen as well keeps it from being read as a range operator), and you can add any other characters that should not be replaced.
Or you may simplify it to
[^\d\s\(\)\-]
and it might work.
Method 2
You may use this regex, which matches exactly the phone number format you have:
\+\d+\s\(\d{3}\)\s\d{3}-\d{4}
You may remove the leading \+ if it is unnecessary:
\d+\s\(\d{3}\)\s\d{3}-\d{4}
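As a rough sketch of how the Method 2 pattern might be used from Swift: inside a normal string literal every backslash has to be doubled. The ^ and $ anchors are an addition here (an assumption that the whole text must match), and `isFormattedPhoneNumber` is a hypothetical helper name.

```swift
import Foundation

// The Method 2 pattern with doubled backslashes; ^ and $ are added so that
// the whole string has to match, which is an assumption of this sketch.
let phonePattern = "^\\+\\d+\\s\\(\\d{3}\\)\\s\\d{3}-\\d{4}$"

// Hypothetical helper: true when the text has the "+# (###) ###-####" shape.
func isFormattedPhoneNumber(_ text: String) -> Bool {
    return text.range(of: phonePattern, options: .regularExpression) != nil
}

print(isFormattedPhoneNumber("+1 (800) 666-8888"))
print(isFormattedPhoneNumber("+1 (800) 666-888"))
```

This only validates a finished number; it does not change how the text field reacts to the delete key.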

Comparing quantities of unique elements contained in large data sets [duplicate]

I'm attempting to solve HackerRank's Hash Table Ransom Note challenge. There are 19 test cases and I'm passing all but two, which time out on the larger data sets (10,000-30,000 entries).
I'm given:
1) an array of words contained in a magazine and
2) an array of words for a ransom note. My objective is to determine if the words in the magazine can be used to construct a ransom note.
I need to have enough unique elements in the magazineWords to satisfy the quantity needed by noteWords.
I'm using this code to make that determination...and it takes FOREVER...
for word in noteWordsSet {
    // check if there are enough unique words in magazineWords to put in the note
    if magazineWords.filter({ $0 == word }).count < noteWords.filter({ $0 == word }).count {
        return "No"
    }
}
What is a faster way to accomplish this task?
Below is my complete code for the challenge:
import Foundation
var magazineWords: [String] = [] // Array of 1 to 30,000 strings
var noteWords: [String] = [] // Array of 1 to 30,000 strings
enum RegexString: String {
// Letters a to z, A to Z, 1 to 5 characters long
case wordCanBeUsed = "([a-zA-Z]{1,5})"
}
func matches(for regexString: String, in text: String) -> [String] {
// Hat tip MartinR for this
do {
let regex = try NSRegularExpression(pattern: regexString)
let nsString = text as NSString
let results = regex.matches(in: text, range: NSRange(location: 0, length: nsString.length))
return results.map { nsString.substring(with: $0.range)}
} catch let error {
print("invalid regex: \(error.localizedDescription)")
return []
}
}
func canCreateRansomNote(from magazineWords: [String], for noteWords: [String]) -> String {
// figure out what's unique
let magazineWordsSet = Set(magazineWords)
let noteWordsSet = Set(noteWords)
let intersectingValuesSet = magazineWordsSet.intersection(noteWordsSet)
// constraints specified in challenge
guard magazineWords.count >= 1, noteWords.count >= 1 else { return "No" }
guard magazineWords.count <= 30000, noteWords.count <= 30000 else { return "No" }
// make sure there are enough individual words to work with
guard magazineWordsSet.count >= noteWordsSet.count else { return "No" }
guard intersectingValuesSet.count == noteWordsSet.count else { return "No" }
// check if all the words can be used. assume the regex method works perfectly
guard noteWords.count == matches(for: RegexString.wordCanBeUsed.rawValue, in: noteWords.joined(separator: " ")).count else { return "No" }
// FIXME: this is a processor hog. I'm timing out when I get to this point
// need to make sure there are enough magazine words to write the note
// compare quantity of word in magazine with quantity of word in note
for word in noteWordsSet {
// check if there are enough unique words in magazineWords to put in the note
if magazineWords.filter({$0==word}).count < noteWords.filter({$0==word}).count {
return "No"
}
}
return "Yes"
}
print(canCreateRansomNote(from: magazineWords, for: noteWords))
I don't know how the test cases are read on the contest website or which frameworks you are allowed to use. If Foundation is allowed, you can use NSCountedSet:
import Foundation
let fileContent = try! String(contentsOf: URL(fileURLWithPath: "/path/to/file.txt"))
let scanner = Scanner(string: fileContent)
var m = 0
var n = 0
scanner.scanInt(&m)
scanner.scanInt(&n)
var magazineWords = NSCountedSet(capacity: m)
var ransomWords = NSCountedSet(capacity: n)
for i in 0..<(m+n) {
var word: NSString? = nil
scanner.scanUpToCharacters(from: .whitespacesAndNewlines, into: &word)
if i < m {
magazineWords.add(word!)
} else {
ransomWords.add(word!)
}
}
var canCreate = true
for w in ransomWords {
if ransomWords.count(for: w) > magazineWords.count(for: w) {
canCreate = false
break
}
}
print(canCreate ? "Yes" : "No")
It works by going through the input file one word at a time, counting how many times each word appears in the magazine and in the ransom note. Then, if any word appears more often in the ransom note than in the magazine, the test fails immediately. It runs the 30,000-word test case in less than 1 second on my 2012 iMac.
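If Foundation is not available, the same counting idea can be sketched with a plain Swift dictionary. This is a minimal illustration, not the judged solution: `wordCounts` is a hypothetical helper, and the function signature is borrowed from the question. Each array is traversed once, instead of being filtered once per distinct word.

```swift
// Hypothetical helper: counts how often each word occurs in one pass.
func wordCounts(_ words: [String]) -> [String: Int] {
    var counts = [String: Int]()
    for word in words {
        counts[word, default: 0] += 1
    }
    return counts
}

func canCreateRansomNote(from magazineWords: [String], for noteWords: [String]) -> String {
    let available = wordCounts(magazineWords)
    let needed = wordCounts(noteWords)
    // Fail as soon as the note needs more copies of a word than the magazine has.
    for (word, count) in needed where available[word, default: 0] < count {
        return "No"
    }
    return "Yes"
}

print(canCreateRansomNote(from: ["give", "me", "one", "grand", "today", "night"],
                          for: ["give", "one", "grand", "today"]))
```

The cost is roughly proportional to the total number of words, while the filter-based loop in the question rescans both arrays for every distinct note word.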

Check if string latin or cyrillic

Is there some way to check whether a string is Latin or Cyrillic? I've tried the localizedCompare String method, but it didn't give me the result I need.
What about something like this?
extension String {
    var isLatin: Bool {
        let upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        let lower = "abcdefghijklmnopqrstuvwxyz"
        // Every character must be one of the unaccented Latin letters
        for c in self {
            if !upper.contains(c) && !lower.contains(c) {
                return false
            }
        }
        return true
    }
    var isCyrillic: Bool {
        let upper = "АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЮЯ"
        let lower = "абвгдежзийклмнопрстуфхцчшщьюя"
        // Every character must be one of the listed Cyrillic letters
        for c in self {
            if !upper.contains(c) && !lower.contains(c) {
                return false
            }
        }
        return true
    }
    var isBothLatinAndCyrillic: Bool {
        return self.isLatin && self.isCyrillic
    }
}
Usage:
let s = "Hello"
if s.isLatin && !s.isBothLatinAndCyrillic {
// String is latin
} else if s.isCyrillic && !s.isBothLatinAndCyrillic {
// String is cyrillic
} else if s.isBothLatinAndCyrillic {
// String can be either latin or cyrillic
} else {
// String is not latin nor cyrillic
}
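Consider that some strings look as if they could be either, for example:
let s = "A"
The Latin "A" (U+0041) and the Cyrillic "А" (U+0410) look identical but are different characters, so each check still gives a definite answer. With the code above, isBothLatinAndCyrillic is true only for the empty string.
A string can also be neither:
let s = "*"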
You can look at the Unicode scalars and decide whether the string contains Cyrillic or Latin characters based on their values. This code is not complete; you can extend it.
let a: String = "ӿ" // Unicode value = 04FF
let scalars = a.unicodeScalars
// Get the Unicode value of the first character:
let unicodeValue = scalars[scalars.startIndex].value // 1279, i.e. 0x04FF
See here for all Unicode values (in hexadecimal):
http://jrgraphix.net/r/Unicode/0400-04FF
According to this site, Cyrillic values range from 0400 to 04FF (1024 to 1279).
This is the code for the Cyrillic check:
var isCyrillic = true
for unicode in scalars {
    if unicode.value < 1024 || unicode.value > 1279 {
        print("not a cyrillic text")
        print(unicode.value)
        isCyrillic = false
        break
    }
}
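The same range check can be written more compactly. This is only a sketch: the property name `isEntirelyCyrillic` is hypothetical, only the U+0400 to U+04FF block quoted above is covered, and treating the empty string as not Cyrillic is an assumption.

```swift
extension String {
    // Hypothetical property: true when the string is non-empty and every
    // Unicode scalar lies in the Cyrillic block U+0400...U+04FF.
    var isEntirelyCyrillic: Bool {
        return !isEmpty && unicodeScalars.allSatisfy { (0x0400...0x04FF).contains($0.value) }
    }
}

print("Привет".isEntirelyCyrillic)
print("Hello".isEntirelyCyrillic)
```

Note that spaces and punctuation fall outside the block, so a sentence such as "Привет мир" is rejected unless those characters are filtered out first.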
Surprisingly, there's no easy answer to your question. The Latin alphabet contains more than just A - Z. There are accented characters in French and archaic forms in German, etc. I don't know the Cyrillic alphabet so I'll leave it alone. On top of that, you have to deal with: punctuation (.,?"(), etc.) and white space, emojis, arrows, dingbats... which are language neutral. The complexity can escalate very quickly depending on your requirements.
The answer you accepted is inadequate to say the least: "hello world".isLatin == false since it doesn't deal with white spaces.
Visit a site like this one to learn what ranges contain characters for which language and play with the code below. It's not a complete answer but meant to get you started:
let neutralRanges: [ClosedRange<UInt32>] = [0x20...0x40]
let latinRanges: [ClosedRange<UInt32>] = [0x41...0x5A, 0x61...0x7A, 0xC0...0xFF, 0x100...0x17F]
let cyrillicRanges: [ClosedRange<UInt32>] = [0x400...0x4FF, 0x500...0x52F]
func scalar(_ scalar: Unicode.Scalar, isInRanges ranges: [ClosedRange<UInt32>]) -> Bool {
    return ranges.contains { $0 ~= scalar.value }
}
let str = "Hello world"
var isLatin = true
var isCyrillic = true
for s in str.unicodeScalars {
    // Neutral characters (space, digits, punctuation) do not affect either result
    if scalar(s, isInRanges: neutralRanges) {
        continue
    }
    // Test each script independently so that a character belonging to neither clears both flags
    if !scalar(s, isInRanges: latinRanges) {
        isLatin = false
    }
    if !scalar(s, isInRanges: cyrillicRanges) {
        isCyrillic = false
    }
}
print(isLatin)
print(isCyrillic)
A couple of comments refer to another post that shows a fairly clean way to determine the language of a String using NSLinguisticTagger (How to detect text (string) language in iOS? ).
NSLinguisticTagger is definitely the best approach here and is intended exactly for this purpose, but it sounds to me like you're actually asking how to identify the script of the String rather than the language. English, French, German (for example) all use Latin script so the language example above doesn't show the ideal way to discern between Latin and Cyrillic (or other scripts).
Instead I wrote the following extension to String that shows how to identify the script for the first sentence in the String you supply - you can then easily adapt/build on this to get the exact thing you want for your use case:
import Foundation // Needed for NSLinguisticTagger
extension String {
    func scriptCode() -> NSLinguisticTag? {
        let linguisticTagger = NSLinguisticTagger(tagSchemes: [.script], options: 0)
        linguisticTagger.string = self
        // The raw value of the returned tag is the ISO 15924 script code
        return linguisticTagger.tag(at: 0, unit: .sentence, scheme: .script, tokenRange: nil)
    }
}
Scripts are uniformly described by four-letter ISO 15924 script codes, such as "Latn", and this is what you get with the returned NSLinguisticTag object. To perform a comparison, just check the raw value of NSLinguisticTag, for example like this:
if let code = yourTestSentence.scriptCode()?.rawValue, code == "Latn" || code == "Cyrl" {
    print("This sentence is in Latin or Cyrillic script")
} else {
    print("Some other script")
}
Caveat: This example only checks the first sentence of whatever string you supply. I haven't tested what happens if that sentence is mixed scripts - most likely the returned tag will be nil.
Here are some useful reference links to Apple's docs, and Wikipedia for more info:
https://developer.apple.com/documentation/foundation/nslinguistictagger
https://developer.apple.com/documentation/foundation/nslinguistictagscheme
https://en.wikipedia.org/wiki/ISO_15924
I hope this can also be useful:
let cyrillicToLatinMap: [Character : String] = [
" ":" ",
"А":"A",
"Б":"B",
"В":"V",
"Г":"G",
"Д":"D",
"Е":"E",
"Ж":"Zh",
"З":"Z",
"И":"I",
"Й":"Y",
"К":"K",
"Л":"L",
"М":"M",
"Н":"N",
"О":"O",
"П":"P",
"Р":"R",
"С":"S",
"Т":"T",
"У":"U",
"Ф":"F",
"Х":"H",
"Ц":"Ts",
"Ч":"Ch",
"Ш":"Sh",
"Щ":"Sht",
"Ъ": "A",
"Ю":"Yu",
"Я":"Ya",
"а":"a",
"б":"b",
"в":"v",
"г":"g",
"д":"d",
"е":"e",
"ж":"zh",
"з":"z",
"и":"i",
"й":"y",
"к":"k",
"л":"l",
"м":"m",
"н":"n",
"о":"o",
"п":"p",
"р":"r",
"с":"s",
"т":"t",
"у":"u",
"ф":"f",
"х":"h",
"ц":"ts",
"ч":"ch",
"ш":"sh",
"щ":"sht",
"ъ": "a",
"ь":"y",
"ю":"yu",
"я":"ya",]
Bulgarian Cyrillic to Latin
class CyrillicToLatinConverter {
    public static func getLatin(wordInCyrillic: String) -> String {
        var wordInLatin = ""
        for character in wordInCyrillic {
            // Give up on the first character that has no entry in the map
            guard let latin = cyrillicToLatinMap[character] else { return "" }
            wordInLatin += latin
        }
        return wordInLatin
    }
    public static func isCyrillic(characters: Character) -> Bool {
        // A character counts as Cyrillic when the map has an entry for it
        return cyrillicToLatinMap[characters] != nil
    }
}
Swift 3:
For Persian and Arabic
extension String {
var isFarsi: Bool {
//Remove extra spaces from the first and last word
let value = self.trimmingCharacters(in: CharacterSet.whitespacesAndNewlines)
if value == "" {
return false
}
let farsiLetters = "آ ا ب پ ت ث ج چ ح خ د ذ ر ز ژ س ش ص ض ط ظ ع غ ف ق ک گ ل م ی ن و ه"
let arabicLetters = " ء ا أ إ ء ؤ ئـ ئ آ اً ة ا ب ت ث ج ‌ ح خ د ذ ر ز س ‌ ش ص ض ط ظ ع غ ف ق ك ل م ن ه و ي"
for c in value.characters.map({ String($0) }) {
if !farsiLetters.contains(c) && !arabicLetters.contains(c) {
return false
}
}
return true
}
}
Swift 5 solution:
extension String {
    var isLatin: Bool {
        let upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        let lower = "abcdefghijklmnopqrstuvwxyz"
        for c in self.map({ String($0) }) where !upper.contains(c) && !lower.contains(c) {
            return false
        }
        return true
    }
}

Replace part of string with lower case letters - Swift

I have a Swift-based iOS app, and one of its features allows you to comment on a post. Users can add "#mentions" in their posts to tag other people. However, I want to stop the user from adding a username that contains a capital letter.
Is there any way I can convert a string so that the #usernames are all in lowercase?
For example:
I really enjoy sightseeing with #uSerABC (not allowed)
I really enjoy sightseeing with #userabc (allowed)
I know there is a string property in Swift called .lowercaseString, but the problem is that it makes the entire string lowercase, and that's not what I want. I only want the #username to be in lowercase.
Is there any way around this without having to use the .lowercaseString property?
Thanks for your time, Dan.
This comes from code I use to detect hashtags; I've modified it to detect mentions:
func detectMentionsInText(text: String) -> [NSRange]? {
    let mentionsDetector = try? NSRegularExpression(pattern: "#(\\w+)", options: .caseInsensitive)
    let results = mentionsDetector?.matches(in: text, options: .withoutAnchoringBounds, range: NSRange(location: 0, length: text.utf16.count))
    return results?.map { $0.range(at: 0) }
}
It detects all the mentions in a string using a regex and returns an array of NSRange. Each range gives you the beginning and the end of a mention, so you can easily replace it with a lowercase version.
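The replacement step could look roughly like the sketch below, which assumes current Swift and Foundation. `lowercasingMentions(in:)` is a hypothetical helper name; it uses the same "#" plus word-characters pattern and rewrites the matches from last to first so that earlier ranges stay valid.

```swift
import Foundation

// Hypothetical helper: lowercases every "#username" token and leaves the
// rest of the text untouched.
func lowercasingMentions(in text: String) -> String {
    guard let regex = try? NSRegularExpression(pattern: "#\\w+") else {
        return text
    }
    var result = text
    let matches = regex.matches(in: text, range: NSRange(text.startIndex..., in: text))
    // Work backwards so a replacement never shifts a range still to be handled.
    for match in matches.reversed() {
        guard let range = Range(match.range, in: result) else { continue }
        let lowered = result[range].lowercased()
        result.replaceSubrange(range, with: lowered)
    }
    return result
}

print(lowercasingMentions(in: "I really enjoy sightseeing with #uSerABC"))
```

Only the matched mentions are touched, so capital letters elsewhere in the comment are preserved.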
Split the string in two with the following call:
let arr = myString.components(separatedBy: "#")
// Convert arr[1] to lower case
// Append it to arr[0]
// Enjoy
Thanks to everyone for their help. In the end I couldn't get any of the solutions to work and after a lot of testing, I came up with this solution:
func correctStringWithUsernames(inputString: String, completion: (correctString: String) -> Void) {
// Create the final string and get all
// the seperate strings from the data.
var finalString: String!
var commentSegments: NSArray!
commentSegments = inputString.componentsSeparatedByString(" ")
if (commentSegments.count > 0) {
for (var loop = 0; loop < commentSegments.count; loop++) {
// Check the username to ensure that there
// are no capital letters in the string.
let currentString = commentSegments[loop] as! String
let capitalLetterRegEx = ".*[A-Z]+.*"
let textData = NSPredicate(format:"SELF MATCHES %@", capitalLetterRegEx)
let capitalResult = textData.evaluateWithObject(currentString)
// Check if the current loop string
// is a #user mention string or not.
if (currentString.containsString("#")) {
// If we are in the first loop then set the
// string otherwise concatenate the string.
if (loop == 0) {
if (capitalResult == true) {
// The username contains capital letters
// so change it to a lower case version.
finalString = currentString.lowercaseString
}
else {
// The username does not contain capital letters.
finalString = currentString
}
}
else {
if (capitalResult == true) {
// The username contains capital letters
// so change it to a lower case version.
finalString = "\(finalString) \(currentString.lowercaseString)"
}
else {
// The username does not contain capital letters.
finalString = "\(finalString) \(currentString)"
}
}
}
else {
// The current string is NOT a #user mention
// so simply set or concatenate the finalString.
if (loop == 0) {
finalString = currentString
}
else {
finalString = "\(finalString) \(currentString)"
}
}
}
}
else {
// No issues pass back the string.
finalString = inputString
}
// Pass back the correct username string.
completion(correctString: finalString)
}
It's certainly not the most elegant or efficient solution around, but it does work. If there are any ways of improving it, please leave a comment.
