Return range with first and last character in string - ios

I have a string: "Hey #username that's funny". For a given string, how can I search it and return every range that starts with # and ends at the next space, so that I can extract the username?
I suppose I can get all indexes of # and, for each, take the substring up to the next space character, but I'm wondering if there's an easier way.

If your username can contain only letters, digits, and underscores, you can use a regular expression for that:
let s = "Hey #username123 that's funny"
if let r = s.range(of: "#\\w+", options: .regularExpression) {
    let name = String(s[r]) // "#username123"
}

@Vladimir's answer is correct, but if you're trying to find multiple occurrences of "username", this should also work:
let s = "Hey #username123 that's funny"
let ranges: [NSRange]
do {
    // Create the regular expression.
    let regex = try NSRegularExpression(pattern: "#\\w+", options: [])
    // Get an array of NSTextCheckingResult and map each result to its range.
    // NSRange counts UTF-16 code units, so build it from the string itself.
    ranges = regex.matches(in: s, options: [], range: NSRange(s.startIndex..., in: s)).map { $0.range }
} catch {
    // There was a problem creating the regular expression.
    ranges = []
}
for range in ranges {
    print((s as NSString).substring(with: range))
}
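If only the name is needed, without the leading #, a capture group can pull it out directly. The following is a minimal sketch under that assumption; the helper name `usernames(in:)` is hypothetical and not part of either answer above.

```swift
import Foundation

/// Hypothetical helper: returns each "#name" match with the leading "#" stripped.
func usernames(in text: String) -> [String] {
    // Capture group 1 holds the word characters that follow "#".
    // try! is used because the pattern is a fixed literal.
    let regex = try! NSRegularExpression(pattern: "#(\\w+)")
    let whole = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: whole).compactMap { match in
        // Convert the NSRange of group 1 back to a Range<String.Index>.
        Range(match.range(at: 1), in: text).map { String(text[$0]) }
    }
}

print(usernames(in: "Hey #alice and #bob_42, that's funny")) // ["alice", "bob_42"]
```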

Related

iOS Swift: looking for ranges of matching word in a string

I need to make a function that returns me ranges of matching words in a given string, for example, given the sentence below:
Hey, bro! Your brother is also her brother.
I want to find an array of Range values in the sentence that match the word "bro". It should match the exact word (case insensitive), so "bro" should only match "bro" but not "brother".
I thought about:
1. split the sentence, e.g. "hey", "bro", "your", "brother", "is", "also", "her", "brother"
2. map each word to a word with range, e.g. "hey" would become ["hey", 0...2]
3. filter and map the word and range array, matching "bro"
Step 2 needs some care to make sure each word is mapped to the right range in the sentence, e.g. the first "brother" and the second "brother" should have different ranges depending on where they are located.
Is there any smarter way of doing this?
Edit:
Sorry, I forgot to mention, the reason for not using Regex was that sometimes the word has a dot in it, for example:
there is orange in the basket.
from the above sentence, finding the string "or.ge" using regex would match "orange" as well.
I have tested this in a Playground. You can use this extension to get the ranges of all matches of a regular expression:
import Foundation
extension String {
    func ranges(of substring: String, options: CompareOptions = [], locale: Locale? = nil) -> [Range<Index>] {
        var ranges: [Range<Index>] = []
        // Keep searching from the end of the previous match until nothing more is found.
        while ranges.last.map({ $0.upperBound < self.endIndex }) ?? true,
              let range = self.range(of: substring, options: options, range: (ranges.last?.upperBound ?? self.startIndex)..<self.endIndex, locale: locale)
        {
            ranges.append(range)
        }
        return ranges
    }
}
let searchString = "bro"
var str = "Hey, bro! Your brother is also her brother."
// The lookarounds require that no letter or digit touches the match on either side.
var reg = str.ranges(of: "(?<![\\p{L}\\d])\(searchString)(?![\\p{L}\\d])", options: [.regularExpression, .caseInsensitive])
str.removeSubrange(reg.first!)
print(str)
Credit to:
iOS - regex to match word boundary, including underscore
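The asker's edit points out that a search word such as "or.ge" contains a regex metacharacter. One way to keep the regex approach is to escape the word with NSRegularExpression.escapedPattern(for:) before building the pattern. The sketch below assumes that; the helper name `wholeWordRanges(of:in:)` is hypothetical.

```swift
import Foundation

/// Hypothetical helper: ranges of `word` in `text`, matched literally and as a whole word.
func wholeWordRanges(of word: String, in text: String) -> [Range<String.Index>] {
    // Escape regex metacharacters so that "or.ge" cannot match "orange".
    let escaped = NSRegularExpression.escapedPattern(for: word)
    // Require that no letter or digit touches the match on either side.
    let pattern = "(?<![\\p{L}\\d])\(escaped)(?![\\p{L}\\d])"
    guard let regex = try? NSRegularExpression(pattern: pattern, options: .caseInsensitive) else {
        return []
    }
    let whole = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: whole).compactMap { Range($0.range, in: text) }
}

let sentence = "Hey, bro! Your brother is also her Bro."
print(wholeWordRanges(of: "bro", in: sentence).map { String(sentence[$0]) }) // ["bro", "Bro"]
print(wholeWordRanges(of: "or.ge", in: "there is orange in the basket.").count) // 0
```

Lookarounds are used instead of \b here because \b behaves unexpectedly when the search word itself begins or ends with a non-word character such as a dot.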
One simple solution is to use regular expressions with \b to match “word boundaries”, e.g.
let searchString = "bro"
let sentence = "Hey, Bro! Your brother is also her brother."
let regex = try! NSRegularExpression(pattern: #"\b\#(searchString)\b"#, options: .caseInsensitive)
regex.enumerateMatches(in: sentence, range: NSRange(sentence.startIndex..., in: sentence)) { match, _, _ in
    guard let match = match else { return }
    print(match.range)
    // or, if you want a String.Range
    if let range = Range(match.range, in: sentence) {
        print(sentence[range])
    }
}
There are other, richer APIs (e.g. the Natural Language framework) which, while not perfect, offer more sophisticated parsing of natural language text. For example, the code below differentiates between the verb “saw” and the noun “saw”:
import NaturalLanguage
let text = "I saw the hammer. I did not see a saw."
let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text
let options: NLTagger.Options = [.omitWhitespace, .joinContractions]
tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass, options: options) { tag, range in
    guard let tag = tag else { return true }
    print(tag, String(text[range]))
    return true
}
Producing:
NLTag(_rawValue: Pronoun) I
NLTag(_rawValue: Verb) saw
NLTag(_rawValue: Determiner) the
NLTag(_rawValue: Noun) hammer
NLTag(_rawValue: SentenceTerminator) .
NLTag(_rawValue: Pronoun) I
NLTag(_rawValue: Verb) did
NLTag(_rawValue: Adverb) not
NLTag(_rawValue: Verb) see
NLTag(_rawValue: Determiner) a
NLTag(_rawValue: Noun) saw
NLTag(_rawValue: SentenceTerminator) .

Convert placeholders such as %1$s to {x} in Swift

I'm parsing an XML doc (using XMLParser) and some of the values have php-like placeholders, e.g. %1$s, and I would like to convert those to {x-1}.
Examples:
%1$s ---> {0}
%2$s ---> {1}
I'm doing this in a seemingly hacky way, using regex:
But there must be a better implementation of this regex.
Consider a string:
let str = "lala fawesfgeksgjesk 3rf3f %1$s rk32mrk3mfa %2$s fafafczcxz %3$s czcz $#$##%## %4$s qqq %5$s"
Now we're going to extract the integer strings between strings % and $s:
let regex = try! NSRegularExpression(pattern: "(?<=%)[^$s]+")
let range = NSRange(location: 0, length: str.utf16.count)
let matches = regex.matches(in: str, options: [], range: range)
matches.forEach {
    print(String(str[Range($0.range, in: str)!]))
}
This mostly works. The issue is that the "4" value is polluted by the random characters that precede %4$s.
Prints:
1
2
3
## %4
5
Is there any better way to do this?
This might not be a very efficient (or swifty :)) way, but it gets the job done. It searches for the placeholder with a regular expression, extracts the numeric value from the matched substring, decrements it, and then replaces the substring with a newly constructed placeholder. This is repeated in a loop until no more matches are found. Note that str must be declared with var for this to work.
let pattern = #"%(\d+)\$s"# // at least one digit, so "%$s" is never matched
while let range = str.range(of: pattern, options: .regularExpression) {
    let placeholder = str[range]
    // Strip everything except the digits, e.g. "%1$s" becomes "1".
    let number = placeholder.trimmingCharacters(in: CharacterSet(charactersIn: "0123456789").inverted)
    // Stop rather than loop forever if the digits cannot be parsed.
    guard let value = Int(number) else { break }
    str = str.replacingOccurrences(of: placeholder, with: "{\(value - 1)}")
}
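The loop above rescans the string after every replacement. A single-pass variant is possible by collecting all matches first and replacing them from back to front, so that earlier ranges stay valid. This is only a sketch of that idea; the function name `convertPlaceholders(in:)` is hypothetical.

```swift
import Foundation

/// Hypothetical helper: converts 1-based "%N$s" placeholders to 0-based "{N-1}".
func convertPlaceholders(in input: String) -> String {
    // "%", one or more digits (capture group 1), then a literal "$s".
    let regex = try! NSRegularExpression(pattern: #"%(\d+)\$s"#)
    var result = input
    let whole = NSRange(input.startIndex..., in: input)
    // Walk the matches back to front so earlier offsets are unaffected by each replacement.
    for match in regex.matches(in: input, range: whole).reversed() {
        guard let placeholder = Range(match.range, in: result),
              let digits = Range(match.range(at: 1), in: result),
              let value = Int(result[digits]) else { continue }
        result.replaceSubrange(placeholder, with: "{\(value - 1)}")
    }
    return result
}

print(convertPlaceholders(in: "a %1$s b $#$##%## %4$s c")) // a {0} b $#$##%## {3} c
```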

Swift: Get an index of beginning and ending character of a word in a String

A string:
"jim#domain.com, bill#domain.com, chad#domain.com, tom#domain.com"
Through gesture recognizer, I am able to get the character the user tapped on (happy to provide code, but don't see the relevance at this point).
Let's say the User tapped on o in "chad#domain.com" and the character index is 39
Given 39 the index of o, I would like to get the string start index of c where "chad#domain.com" begins, and an end index for m from "com" where "chad#domain.com" ends.
In other words, given the index of a character in a String, I need the indices that bound the surrounding word: the one just after the nearest space on the left and the one just before the nearest comma on the right.
I tried this, but it only returns the last word in the String:
if let range = text.range(of: " ", options: .backwards) {
    let suffix = String(text.suffix(from: range.upperBound))
    print(suffix) // tom#domain.com
}
I am not sure where to go from here.
You can call range(of:) on two slices of the given string:
text[..<index] is the text preceding the given character position,
and text[index...] is the text starting at the given position.
Example:
let text = "jim#domain.com, bill#domain.com, chad#domain.com, tom#domain.com"
let index = text.index(text.startIndex, offsetBy: 39)
// Search the space before the given position:
let start = text[..<index].range(of: " ", options: .backwards)?.upperBound ?? text.startIndex
// Search the comma after the given position:
let end = text[index...].range(of: ",")?.lowerBound ?? text.endIndex
print(text[start..<end]) // chad#domain.com
Both range(of:) calls return nil if no space (or comma) is
found. In that case the nil-coalescing operator ?? falls back to
the start (or end) index of the whole string.
(Note that this works because Substrings share a common index
with their originating string.)
An alternative approach is to use a "data detector",
so that the URL detection does not depend on certain separators.
Example (compare How to detect a URL in a String using NSDataDetector):
let text = "jim#domain.com, bill#domain.com, chad#domain.com, tom#domain.com"
let index = text.index(text.startIndex, offsetBy: 39)
let detector = try! NSDataDetector(types: NSTextCheckingResult.CheckingType.link.rawValue)
let matches = detector.matches(in: text, range: NSRange(location: 0, length: text.utf16.count))
for match in matches {
    if let range = Range(match.range, in: text), range.contains(index) {
        print(text[range])
    }
}
A different approach:
You have the string and the Int index:
let string = "jim#domain.com, bill#domain.com, chad#domain.com, tom#domain.com"
let characterIndex = 39
Get the String.Index from the Int:
let stringIndex = string.index(string.startIndex, offsetBy: characterIndex)
Convert the string into an array of addresses:
let addresses = string.components(separatedBy: ", ")
Map the addresses to their ranges (Range<String.Index>) in the string. Note that range(of:) finds the first occurrence, so this assumes the addresses are unique:
let ranges = addresses.map { string.range(of: $0)! }
Get the (Int) index of the range which contains stringIndex, then look up the corresponding address:
if let index = ranges.firstIndex(where: { $0.contains(stringIndex) }) {
    let address = addresses[index]
    print(address) // chad#domain.com
}
Another approach is to split the original string on the separator and then use simple arithmetic to find which element of the array contains the given position (39). From there you can get either the matching string or the indexes of the previous space and next comma, depending on your end goal.
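The split-and-count idea can be sketched as follows. It assumes the entries are separated by exactly ", " and that the position is a character offset; the helper name `entry(containing:in:separator:)` is hypothetical.

```swift
import Foundation

/// Hypothetical helper: returns the entry that contains the character at `offset`.
func entry(containing offset: Int, in text: String, separator: String = ", ") -> String? {
    var start = 0 // character offset at which the current entry begins
    for part in text.components(separatedBy: separator) {
        let end = start + part.count // one past the entry's last character
        if (start..<end).contains(offset) { return part }
        start = end + separator.count // skip over the separator
    }
    return nil // the offset falls on a separator or outside the string
}

let addresses = "jim#domain.com, bill#domain.com, chad#domain.com, tom#domain.com"
print(entry(containing: 39, in: addresses) ?? "no match") // chad#domain.com
```

Because the running `start` and `end` offsets are tracked per entry, duplicate entries are handled correctly, unlike a lookup based on range(of:).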

Cut a String from start position to end position with swift 3

I have strings of the form string \ string, for example:
"some sting with random length\233"
I want to delete the last \ and get the value after it, so the result will be
"some sting with random length"
"233"
I tried this code, but it's not working:
let regex = try! NSRegularExpression(pattern: "\\\s*(\\S[^,]*)$")
if let match = regex.firstMatch(in: string, range: string.nsRange) {
    let result = string.substring(with: match.rangeAt(1))
}
You did not correctly adapt the pattern from How to get substring after last occurrence of character in string: Swift IOS to your case. Both instances of the comma must be replaced by a backslash,
and that must be "double-escaped":
let regex = try! NSRegularExpression(pattern: "\\\\\\s*(\\S[^\\\\]*)$")
(once to be interpreted as a literal backslash in the regex pattern, and
once more for the Swift string literal).
However, a simpler solution is to find the last occurrence of the
backslash and extract the suffix from that position:
let string = "some sting with random length\\233"
let separator = "\\" // A single(!) backslash
if let r = string.range(of: separator, options: .backwards) {
    let prefix = string.substring(to: r.lowerBound)
    let suffix = string.substring(from: r.upperBound)
    print(prefix) // some sting with random length
    print(suffix) // 233
}
Update for Swift 4:
if let r = string.range(of: separator, options: .backwards) {
    let prefix = string[..<r.lowerBound]
    let suffix = string[r.upperBound...]
    print(prefix) // some sting with random length
    print(suffix) // 233
}
prefix and suffix are of type String.SubSequence (i.e. Substring), which can be used
in many places instead of a String. If necessary, create a real
string:
let prefix = String(string[..<r.lowerBound])
let suffix = String(string[r.upperBound...])
You could do this with regex, but I think this solution is better:
yourString.components(separatedBy: "\\").last!
It splits the string with \ as the separator and gets the last split.
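Since the question asks for both parts, here is a minimal sketch that returns the text before and after the last backslash without Foundation, using the standard library's lastIndex(of:) (Swift 4.2 or later). The function name `splitAtLastBackslash(_:)` is hypothetical.

```swift
/// Hypothetical helper: splits `s` around its last backslash, or returns nil if there is none.
func splitAtLastBackslash(_ s: String) -> (head: String, tail: String)? {
    // "\\" in a Swift literal is a single backslash character.
    guard let i = s.lastIndex(of: "\\") else { return nil }
    let head = String(s[..<i])               // everything before the backslash
    let tail = String(s[s.index(after: i)...]) // everything after it
    return (head, tail)
}

if let parts = splitAtLastBackslash("some sting with random length\\233") {
    print(parts.head) // some sting with random length
    print(parts.tail) // 233
}
```

Returning an optional avoids the force unwrap in `.last!` and makes the "no backslash" case explicit.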

How to get range of specific substring even if a duplicate

I want to detect the words that begin with a #, and return their specific ranges. Initially I tried using the following code:
for word in words {
    if word.hasPrefix("#") {
        let matchRange = theSentence.range(of: word)
        //Do stuff with this word
    }
}
This works fine, except that for a duplicate hashtag it returns the range of the first occurrence both times, because range(of:) always finds the first match in the string.
Say I have the following string:
"The range of #hashtag should be different to this #hashtag"
This will return (13, 8) for both hashtags, when really it should return (13, 8) as well as (50, 8). How can this be fixed? Please note that emojis should be able to be detected in the hashtag too.
EDIT
If you want to know how to do this with emojis too, go here
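Create a regular expression for the hashtag and use NSRegularExpression to find the range of every match:
import Foundation
let str = "The range of #hashtag should be different to this #hashtag"
// try! is acceptable here because the pattern is a fixed, known-valid literal.
let regex = try! NSRegularExpression(pattern: "(#[A-Za-z0-9]*)", options: [])
// NSRange counts UTF-16 code units, so derive it from the string rather than from the character count.
let matches = regex.matches(in: str, options: [], range: NSRange(str.startIndex..., in: str))
for match in matches {
    print("match = \(match.range)")
}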
Why not split the sentence into chunks, where each chunk starts with #? Then you also know how many times each hashtag appears in the sentence.
Edit: I think the regex answer is the best way to do this, but here is another approach to the same problem.
var hashtagWords: [String] = []
for word in words {
    if word.hasPrefix("#") {
        // Collect all words which begin with # in an array
        hashtagWords.append(word)
    }
}
// Work on a mutable copy of the original sentence, since we will change it
var mutatedSentence = theSentence
for hashtagWord in hashtagWords {
    if let aRange = mutatedSentence.range(of: hashtagWord) {
        // Found an occurrence: remove it so that the next search finds the following one
        mutatedSentence = mutatedSentence.replacingCharacters(in: aRange, with: "")
    }
}
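Another way to get a distinct range per hashtag, including duplicates and hashtags that contain emoji, relies on the fact that the Substrings returned by split(whereSeparator:) keep the indices of the original string. The sketch below assumes hashtags are delimited by whitespace; the helper name `hashtagRanges(in:)` is hypothetical.

```swift
import Foundation

/// Hypothetical helper: the range of every whitespace-delimited word that starts with "#".
func hashtagRanges(in sentence: String) -> [Range<String.Index>] {
    return sentence
        .split(whereSeparator: { $0.isWhitespace })
        .filter { $0.hasPrefix("#") }
        // A Substring's indices are valid in the string it was split from.
        .map { $0.startIndex..<$0.endIndex }
}

let tagged = "The range of #hashtag should be different to this #hashtag"
for range in hashtagRanges(in: tagged) {
    // Convert to NSRange only if a (location, length) pair is needed.
    let nsRange = NSRange(range, in: tagged)
    print(nsRange.location, nsRange.length, tagged[range])
}
```

For the sample sentence this yields locations 13 and 50, each with length 8, matching the ranges the question asks for.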
