How to get range of specific substring even if a duplicate - ios

I want to detect the words that begin with a #, and return their specific ranges. Initially I tried using the following code:
for word in words {
if word.hasPrefix("#") {
let matchRange = theSentence.range(of: word)
//Do stuff with this word
}
}
This works fine, except that for a duplicate hashtag it returns the range of the first occurrence every time. This is because range(of:) always searches from the start of the string.
Say I have the following string:
"The range of #hashtag should be different to this #hashtag"
This will return (13, 8) for both hashtags, when really it should return (13, 8) for the first and (50, 8) for the second. How can this be fixed? Please note that emojis inside a hashtag should be detected too.
EDIT
If you want to know how to do this with emojis too, go here

Create a regular expression for the hashtags, run it with NSRegularExpression, and read the range of each match.
let str = "The range of #hashtag should be different to this #hashtag"
do {
    let regex = try NSRegularExpression(pattern: "(#[A-Za-z0-9]*)", options: [])
    // NSRange counts UTF-16 code units, so build it from the string's own indices
    let matches = regex.matches(in: str, options: [], range: NSRange(str.startIndex..., in: str))
    for match in matches {
        print("match = \(match.range)")
    }
} catch {
    print("Invalid pattern: \(error)")
}
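The question also asks for emoji inside hashtags, which the character class [A-Za-z0-9] does not cover. The following is only a sketch under one assumption: that a hashtag is "#" followed by any run of characters that are neither whitespace nor another "#". The helper name hashtagRanges(in:) is made up for illustration. The returned NSRange values count UTF-16 code units, so an emoji usually contributes 2 to the length.

```swift
import Foundation

// Hypothetical helper: returns the NSRange of every hashtag in `text`.
// Assumption: a hashtag is "#" followed by one or more characters that are
// neither whitespace nor "#", which lets emoji through.
func hashtagRanges(in text: String) -> [NSRange] {
    guard let regex = try? NSRegularExpression(pattern: "#[^\\s#]+") else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, options: [], range: fullRange).map { $0.range }
}

let hashtagSentence = "The range of #hashtag should be different to this #hashtag"
for range in hashtagRanges(in: hashtagSentence) {
    print(range.location, range.length)
}
```

For the sentence from the question this gives one range per occurrence, so the duplicate hashtag is reported at its own location.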

Why not split the sentence into chunks, where each chunk starts with #? Then you also know how many times each hashtag appears in the sentence.
Edit: I think the regex answer is the best way to do this, but here is another approach to the same problem.
var hashtagWords = [String]()
for word in words {
    if word.hasPrefix("#") {
        // Collect all words which begin with # in an array
        hashtagWords.append(word)
    }
}
// Work on a copy of the original sentence since we will change it
var mutatedSentence = theSentence
for hashtagWord in hashtagWords {
    if let aRange = mutatedSentence.range(of: hashtagWord) {
        // aRange is the current occurrence; do stuff with it here.
        // Then overwrite it with filler of the same character count, so the next
        // search finds the next occurrence and later character offsets do not shift.
        let filler = String(repeating: " ", count: hashtagWord.count)
        mutatedSentence.replaceSubrange(aRange, with: filler)
    }
}
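A variation that avoids mutating the sentence is to keep a cursor and search only in the part of the sentence that has not been consumed yet. This is a minimal sketch, assuming the words are separated by single spaces as in the question; the function name orderedHashtagRanges(in:) is hypothetical.

```swift
import Foundation

// Hypothetical helper: walks the sentence left to right and searches for each
// word only after the end of the previous word, so duplicates get distinct ranges.
func orderedHashtagRanges(in sentence: String) -> [Range<String.Index>] {
    var result: [Range<String.Index>] = []
    var cursor = sentence.startIndex
    // Assumption: words are separated by spaces; empty pieces are skipped.
    for word in sentence.components(separatedBy: " ") where !word.isEmpty {
        guard let range = sentence.range(of: word, range: cursor..<sentence.endIndex) else { continue }
        if word.hasPrefix("#") {
            result.append(range)
        }
        // Move past this word so the next search cannot match it again.
        cursor = range.upperBound
    }
    return result
}

let taggedSentence = "The range of #hashtag should be different to this #hashtag"
let characterOffsets = orderedHashtagRanges(in: taggedSentence).map {
    taggedSentence.distance(from: taggedSentence.startIndex, to: $0.lowerBound)
}
print(characterOffsets)
```

The ranges are Range<String.Index> values, so they can be used directly to subscript the original sentence.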

Related

iOS Swift: looking for ranges of matching word in a string

I need to make a function that returns me ranges of matching words in a given string, for example, given the sentence below:
Hey, bro! Your brother is also her brother.
I want to get an array of Range values for every place in the sentence that matches the word "bro". It should match the exact word (case insensitive), so "bro" should only match "bro" but not "brother".
I thought about:
split the sentence, e.g. "hey", "bro", "your", "brother", "is", "also", "her", "brother"
map each word to a word with range, e.g. "hey" would become ["hey", 0...2]
filter and map the word and range array, matching "bro"
Step 2 needs some treatment to make sure the range for each word (in the sentence) can be mapped to the right word, e.g. the first "brother" and second "brother" should have different ranges depending on where they are located.
Is there any smarter way of doing this?
Edit:
Sorry, I forgot to mention, the reason for not using Regex was that sometimes the word has a dot in it, for example:
there is orange in the basket.
from the above sentence, finding the string "or.ge" using regex would match "orange" as well.
I have tested this in a Playground. You can use this extension to get the ranges matching a regex.
extension String {
func ranges(of substring: String, options: CompareOptions = [], locale: Locale? = nil) -> [Range<Index>] {
var ranges: [Range<Index>] = []
while ranges.last.map({ $0.upperBound < self.endIndex }) ?? true,
let range = self.range(of: substring, options: options, range: (ranges.last?.upperBound ?? self.startIndex)..<self.endIndex, locale: locale)
{
ranges.append(range)
}
return ranges
}
}
let searchString = "bro"
var str = "Hey, bro! Your brother is also her brother."
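// Escape the search term so that regex metacharacters in it (such as ".") are matched literally
var reg = str.ranges(of: "(?<![\\p{L}\\d])\(NSRegularExpression.escapedPattern(for: searchString))(?![\\p{L}\\d])", options: [.regularExpression, .caseInsensitive])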
str.removeSubrange(reg.first!)
print(str)
Credit: iOS - regex to match word boundary, including underscore
One simple solution is to use regular expressions with \b to match “word boundaries”, e.g.
let searchString = "bro"
let sentence = "Hey, Bro! Your brother is also her brother."
let regex = try! NSRegularExpression(pattern: #"\b\#(searchString)\b"#, options: .caseInsensitive)
regex.enumerateMatches(in: sentence, range: NSRange(sentence.startIndex..., in: sentence)) { match, _, _ in
guard let match = match else { return }
print(match.range)
// or, if you want a String.Range
if let range = Range(match.range, in: sentence) {
print(sentence[range])
}
}
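The question's edit mentions search terms that contain a dot, such as "or.ge", which an unescaped pattern would also match against "orange". One way to keep the \b approach is to escape the term with NSRegularExpression.escapedPattern(for:) before building the pattern. The helper name wholeWordRanges(of:in:) below is made up for this sketch.

```swift
import Foundation

// Hypothetical helper: ranges of `term` as a whole word, case insensitive.
// The term is escaped first, so "." in it only matches a literal dot.
func wholeWordRanges(of term: String, in sentence: String) -> [NSRange] {
    let escaped = NSRegularExpression.escapedPattern(for: term)
    guard let regex = try? NSRegularExpression(pattern: "\\b\(escaped)\\b",
                                               options: .caseInsensitive) else { return [] }
    let fullRange = NSRange(sentence.startIndex..., in: sentence)
    return regex.matches(in: sentence, range: fullRange).map { $0.range }
}

print(wholeWordRanges(of: "bro", in: "Hey, Bro! Your brother is also her brother."))
print(wholeWordRanges(of: "or.ge", in: "there is orange in the basket."))
```

Note that \b only behaves as expected when the term starts and ends with a word character; a term such as ".net" would need lookarounds instead.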
There are other, richer APIs (e.g. the Natural Language framework) which, while not perfect, provide more sophisticated parsing of natural language text. For example, the code below differentiates between the verb “saw” and the noun “saw”:
import NaturalLanguage
let text = "I saw the hammer. I did not see a saw."
let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text
let options: NLTagger.Options = [.omitWhitespace, .joinContractions]
tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass, options: options) { tag, range in
guard let tag = tag else { return true }
print(tag, String(text[range]))
return true
}
Producing:
NLTag(_rawValue: Pronoun) I
NLTag(_rawValue: Verb) saw
NLTag(_rawValue: Determiner) the
NLTag(_rawValue: Noun) hammer
NLTag(_rawValue: SentenceTerminator) .
NLTag(_rawValue: Pronoun) I
NLTag(_rawValue: Verb) did
NLTag(_rawValue: Adverb) not
NLTag(_rawValue: Verb) see
NLTag(_rawValue: Determiner) a
NLTag(_rawValue: Noun) saw
NLTag(_rawValue: SentenceTerminator) .

Swift: Get an index of beginning and ending character of a word in a String

A string:
"jim@domain.com, bill@domain.com, chad@domain.com, tom@domain.com"
Through a gesture recognizer, I am able to get the character the user tapped on (happy to provide code, but don't see the relevance at this point).
Let's say the user tapped on o in "chad@domain.com" and the character index is 39.
Given 39, the index of o, I would like to get the start index of c, where "chad@domain.com" begins, and the end index of m in "com", where "chad@domain.com" ends.
In other words, given the index of a character in a String, I need to find the index just after the nearest space to its left and the index just before the nearest comma to its right.
I tried the following, but it only provides the last word in the String:
if let range = text.range(of: " ", options: .backwards) {
    let suffix = String(text.suffix(from: range.upperBound))
    print(suffix) // tom@domain.com
}
I am not sure where to go from here.
You can call range(of:) on two slices of the given string:
text[..<index] is the text preceding the given character position,
and text[index...] is the text starting at the given position.
Example:
let text = "jim@domain.com, bill@domain.com, chad@domain.com, tom@domain.com"
let index = text.index(text.startIndex, offsetBy: 39)
// Search the space before the given position:
let start = text[..<index].range(of: " ", options: .backwards)?.upperBound ?? text.startIndex
// Search the comma after the given position:
let end = text[index...].range(of: ",")?.lowerBound ?? text.endIndex
print(text[start..<end]) // chad@domain.com
Both range(of:) calls return nil if no space (or comma) has
been found. In that case the nil-coalescing operator ?? is used
to get the start (or end) index instead.
(Note that this works because Substrings share a common index
with their originating string.)
An alternative approach is to use a "data detector",
so that the URL detection does not depend on certain separators.
Example (compare How to detect a URL in a String using NSDataDetector):
let text = "jim@domain.com, bill@domain.com, chad@domain.com, tom@domain.com"
let index = text.index(text.startIndex, offsetBy: 39)
let detector = try! NSDataDetector(types: NSTextCheckingResult.CheckingType.link.rawValue)
let matches = detector.matches(in: text, range: NSRange(location: 0, length: text.utf16.count))
for match in matches {
if let range = Range(match.range, in: text), range.contains(index) {
print(text[range])
}
}
Different approach:
You have the string and the Int index
let string = "jim@domain.com, bill@domain.com, chad@domain.com, tom@domain.com"
let characterIndex = 39
Get the String.Index from the Int
let stringIndex = string.index(string.startIndex, offsetBy: characterIndex)
Convert the string into an array of addresses
let addresses = string.components(separatedBy: ", ")
Map the addresses to their ranges (Range<String.Index>) in the string
let ranges = addresses.map{string.range(of: $0)!}
Get the (Int) index of the range which contains stringIndex
if let index = ranges.firstIndex(where: { $0.contains(stringIndex) }) {
    // Get the corresponding address
    let address = addresses[index]
}
Another approach is to split the original string on ", " and then use simple arithmetic to find which element of the array contains the given position (39). From there you can get the matching string, or the indexes of the preceding space and the following comma, depending on what your end goal is.
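The arithmetic described above could look roughly like this. It is only a sketch: the function name element(containing:in:separator:) and the tuple it returns are made up for illustration, and it assumes the separator is exactly ", " and that positions are counted in characters.

```swift
import Foundation

// Hypothetical helper: finds the separated element that contains the given
// character offset and reports its half-open character range [start, end).
func element(containing offset: Int, in text: String,
             separator: String = ", ") -> (element: String, start: Int, end: Int)? {
    var start = 0
    for part in text.components(separatedBy: separator) {
        let end = start + part.count
        if (start..<end).contains(offset) {
            return (element: part, start: start, end: end)
        }
        // Skip over this element and the separator that follows it.
        start = end + separator.count
    }
    return nil // the offset falls on a separator or outside the string
}

let addressList = "jim@domain.com, bill@domain.com, chad@domain.com, tom@domain.com"
if let hit = element(containing: 39, in: addressList) {
    print(hit.element, hit.start, hit.end)
}
```

Because each element's start is derived from the running total, duplicate addresses in the list still resolve to the correct position.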

Find index of Nth instance of substring in string in Swift

My Swift app involves searching through text in a UITextView. The user can search for a certain substring within that text view, then jump to any instance of that string in the text view (say, the third instance). I need to find out the integer value of which character they are on.
For example:
Example 1: The user searches for "hello" and the text view reads "hey hi hello, hey hi hello", then the user presses down arrow to view second instance. I need to know the integer value of the first h in the second hello (i.e. which # character that h in hello is within the text view). The integer value should be 22.
Example 2: The user searches for "abc" while the text view reads "abcd" and they are looking for the first instance of abc, so the integer value should be 1 (which is the integer value of that a since it's the first character of the instance they're searching for).
How can I get the index of the character the user is searching for?
Xcode 11 • Swift 5 or later
let sentence = "hey hi hello, hey hi hello"
let query = "hello"
var searchRange = sentence.startIndex..<sentence.endIndex
var indices: [String.Index] = []
while let range = sentence.range(of: query, options: .caseInsensitive, range: searchRange) {
    searchRange = range.upperBound..<searchRange.upperBound
    indices.append(range.lowerBound)
}
// Convert each String.Index to a zero-based character offset before printing
print(indices.map { sentence.distance(from: sentence.startIndex, to: $0) }) // "[7, 21]\n"
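If only the Nth occurrence is needed, the same loop can stop early. The sketch below wraps it in a helper whose name, characterOffset(ofOccurrence:of:in:), is hypothetical. It returns a zero-based character offset, so add 1 to get the one-based numbering used in the question's examples.

```swift
import Foundation

// Hypothetical helper: zero-based character offset of the n-th (1-based)
// case-insensitive occurrence of `query` in `text`, or nil if there is none.
func characterOffset(ofOccurrence n: Int, of query: String, in text: String) -> Int? {
    guard n > 0 else { return nil }
    var searchRange = text.startIndex..<text.endIndex
    var found = 0
    while let range = text.range(of: query, options: .caseInsensitive, range: searchRange) {
        found += 1
        if found == n {
            return text.distance(from: text.startIndex, to: range.lowerBound)
        }
        // Continue searching after the occurrence that was just found.
        searchRange = range.upperBound..<text.endIndex
    }
    return nil
}

if let offset = characterOffset(ofOccurrence: 2, of: "hello", in: "hey hi hello, hey hi hello") {
    print(offset)
}
```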
Another approach is NSRegularExpression, which is designed to easily iterate through matches in a string. If you use the .ignoreMetacharacters option, it will not apply any sophisticated wildcard/regex logic, but will just look for the string in question. So consider:
let string = "hey hi hello, hey hi hello" // string to search within
let searchString = "hello" // string to search for
let matchToFind = 2 // grab the second occurrence
let regex = try! NSRegularExpression(pattern: searchString, options: [.caseInsensitive, .ignoreMetacharacters])
You could use enumerateMatches:
var count = 0
let range = NSRange(string.startIndex ..< string.endIndex, in: string)
regex.enumerateMatches(in: string, range: range) { result, _, stop in
count += 1
if count == matchToFind {
print(result!.range.location)
stop.pointee = true
}
}
Or you can just find all of them with matches(in:range:) and then grab the n'th one:
let matches = regex.matches(in: string, range: range)
if matches.count >= matchToFind {
print(matches[matchToFind - 1].range.location)
}
Obviously, if you were so inclined, you could omit the .ignoreMetacharacters option and allow the user to perform regex searches, too (e.g. wildcards, whole word searches, start of word, etc.).
For Swift 2, see previous revision of this answer.

Return range with first and last character in string

I have a string: "Hey #username that's funny". For a given string, how can I find all ranges that start with the character # and end at the next space, in order to get the username?
I suppose I can get all indexes of # and for each, get the substringToIndex of the next space character, but wondering if there's an easier way.
If your username can contain only letters, numbers, and underscores, you can use a regular expression for that:
let s = "Hey #username123 that's funny"
if let r = s.range(of: "#\\w+", options: .regularExpression) {
    let name = String(s[r]) // "#username123"
}
@Vladimir's answer is correct, but if you're trying to find multiple occurrences of "username", this should also work:
let s = "Hey #username123 that's funny"
let ranges: [NSRange]
do {
    // Create the regular expression.
    let regex = try NSRegularExpression(pattern: "#\\w+", options: [])
    // Use the regular expression to get an array of NSTextCheckingResult.
    // Use map to extract the range from each result.
    // NSRange counts UTF-16 code units, so the search range is built from the string's indices.
    ranges = regex.matches(in: s, options: [], range: NSRange(s.startIndex..., in: s)).map { $0.range }
}
catch {
    // There was a problem creating the regular expression
    ranges = []
}
for range in ranges {
    print((s as NSString).substring(with: range))
}
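To get the username itself without the leading "#", together with a Range<String.Index> that can be used on the Swift string directly, a capture group can be added to the pattern. This is a sketch only; the function name usernames(in:) and the shape of the returned tuples are assumptions made for illustration.

```swift
import Foundation

// Hypothetical helper: every "#name" mention with the bare name and the
// range of the whole mention (including "#") in the original string.
func usernames(in text: String) -> [(name: String, range: Range<String.Index>)] {
    guard let regex = try? NSRegularExpression(pattern: "#(\\w+)") else { return [] }
    let matches = regex.matches(in: text, range: NSRange(text.startIndex..., in: text))
    return matches.compactMap { match -> (name: String, range: Range<String.Index>)? in
        // Convert the UTF-16 based NSRanges back to String.Index ranges.
        guard let whole = Range(match.range, in: text),
              let nameRange = Range(match.range(at: 1), in: text) else { return nil }
        return (name: String(text[nameRange]), range: whole)
    }
}

let message = "Hey #username123 that's funny, #bob"
for mention in usernames(in: message) {
    print(mention.name, message[mention.range])
}
```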

Parsing & contracting Russian full names

I have several text fields used to enter full name and short name, among other data. My task is:
Check if entered full name matches the standard Russian Cyrillic full name pattern:
Иванов Иван Иванович (three capitalized Cyrillic strings separated by spaces)
If it matches, create another string by auto-contracting full name according to pattern below and enter it to the corresponding text field:
Иванов И.И. (first string, space, first character of the second string, dot, first character of the third string, dot)
If it doesn't match, do nothing.
Currently I use the following code:
let fullNameArray = fullNameField.text!.split(separator: " ").map(String.init)
if fullNameArray.count == 3 {
    if fullNameArray[0] == fullNameArray[0].capitalized && fullNameArray[1] == fullNameArray[1].capitalized && fullNameArray[2] == fullNameArray[2].capitalized {
        shortNameField.text = "\(fullNameArray[0]) \(fullNameArray[1].first!).\(fullNameArray[2].first!)."
    }
}
How can I improve it? Maybe regular expressions could help me? If so, could you post some sample code?
My current solution:
do {
    let regex = try NSRegularExpression(pattern: "^\\p{Lu}\\p{Ll}+\\s\\p{Lu}\\p{Ll}+\\s\\p{Lu}\\p{Ll}+$", options: .anchorsMatchLines)
    let fullName = fullNameField.text ?? ""
    // NSRange counts UTF-16 code units, so build it from the string's indices
    if regex.firstMatch(in: fullName, options: [], range: NSRange(fullName.startIndex..., in: fullName)) != nil {
        let fullNameArray = fullName.split(separator: " ").map(String.init)
        shortNameField.text = "\(fullNameArray[0]) \(fullNameArray[1].first!).\(fullNameArray[2].first!)."
    }
    else {
        shortNameField.text = ""
    }
} catch {
    print(error.localizedDescription)
}
It processes my full name pattern correctly.
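The check and the contraction can also be pulled out of the text field handling into a pure function, which makes them easier to test. The following is a sketch under stated assumptions: the function name shortName(from:) is hypothetical, and the pattern uses a literal space instead of \s so that it agrees with the split on " ". Like the pattern above, \p{Lu} and \p{Ll} accept uppercase and lowercase letters of any script, not only Cyrillic.

```swift
import Foundation

// Hypothetical helper: returns the contracted name ("Иванов И.И.") when the
// input is three capitalized words separated by single spaces, otherwise nil.
func shortName(from fullName: String) -> String? {
    let pattern = "^\\p{Lu}\\p{Ll}+ \\p{Lu}\\p{Ll}+ \\p{Lu}\\p{Ll}+$"
    guard fullName.range(of: pattern, options: .regularExpression) != nil else { return nil }
    let parts = fullName.split(separator: " ")
    guard parts.count == 3,
          let firstInitial = parts[1].first,
          let secondInitial = parts[2].first else { return nil }
    return "\(parts[0]) \(firstInitial).\(secondInitial)."
}

print(shortName(from: "Иванов Иван Иванович") ?? "no match")
print(shortName(from: "иванов иван") ?? "no match")
```

With this in place the text field code reduces to assigning shortName(from:) of the entered text, falling back to an empty string when it returns nil.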
