Optional capture groups with NSRegularExpression in Swift

I want to have multiple capture groups that can be optional and I want to access the strings they correspond to.
Something that looks/works like this:
let text1 = "something with foo and bar"
let text2 = "something with just bar"
let regex = NSRegularExpression(pattern: "(foo)? (bar)")
for (first?, second) in regex.matches(in:text1) {
print(first) // foo
print(second) // bar
}
for (first?, second) in regex.matches(in:text2) {
print(first) // nil
print(second) // bar
}

Retrieving captured subtext with NSRegularExpression is not so easy.
First of all, the result of matches(in:range:) is [NSTextCheckingResult], and an NSTextCheckingResult cannot be destructured into a tuple like (first?, second).
To retrieve captured subtext, you need to get its range from the NSTextCheckingResult with the range(at:) method (spelled rangeAt(_:) in Swift 3). range(at: 0) is the range matching the whole pattern, range(at: 1) is the first capture, range(at: 2) the second, and so on.
Note that range(at:) returns an NSRange, not a Swift Range. Its location and length are based on the UTF-16 representation of NSString.
And this is the most important part for your purpose: range(at:) returns NSRange(location: NSNotFound, length: 0) for each missing capture.
So, you may need to write something like this:
let text1 = "something with foo and bar"
let text2 = "something with just bar"
let regex = try! NSRegularExpression(pattern: "(?:(foo).*)?(bar)") //please find a better example...
for match in regex.matches(in: text1, range: NSRange(0..<text1.utf16.count)) {
    // range(at:) is the Swift 4+ spelling of rangeAt(_:)
    let firstRange = match.range(at: 1)
    let secondRange = match.range(at: 2)
    // A capture group that did not participate has location == NSNotFound
    let first = firstRange.location != NSNotFound ? (text1 as NSString).substring(with: firstRange) : nil
    let second = (text1 as NSString).substring(with: secondRange)
    print(first) // Optional("foo")
    print(second) // bar
}
for match in regex.matches(in: text2, range: NSRange(0..<text2.utf16.count)) {
    let firstRange = match.range(at: 1)
    let secondRange = match.range(at: 2)
    let first = firstRange.location != NSNotFound ? (text2 as NSString).substring(with: firstRange) : nil
    let second = (text2 as NSString).substring(with: secondRange)
    print(first) // nil
    print(second) // bar
}
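The two loops above repeat the same NSNotFound check, so it may be convenient to wrap it once. The following is only a sketch: the function name captureGroups(of:in:) is made up for illustration and is not part of Foundation. It returns one array per match, where index 0 is the whole match and a capture group that did not participate is nil.

```swift
import Foundation

// Hypothetical helper (not a Foundation API): for every match, returns the
// matched substrings by group index. Index 0 is the whole match; a group that
// did not participate in the match is reported as nil.
func captureGroups(of pattern: String, in text: String) throws -> [[String?]] {
    let regex = try NSRegularExpression(pattern: pattern)
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).map { match in
        (0..<match.numberOfRanges).map { index -> String? in
            let nsRange = match.range(at: index)
            // Missing captures come back with location == NSNotFound
            guard nsRange.location != NSNotFound,
                  let range = Range(nsRange, in: text) else { return nil }
            return String(text[range])
        }
    }
}
```

Called with the pattern "(?:(foo).*)?(bar)" and text2 from above, the single match would have nil at index 1 and "bar" at index 2.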

Related

Swift 5: Filter a string of characters, getting only the numbers with a condition

I want to filter a string and keep only the numbers that have a given number of digits, for example 10, discarding the other numbers that do not meet the condition.
I tried something like this:
let phoneAddress = "My phone number 2346172891, and my address is Florida 2234"
let withTrimming = phoneAddress.replacingOccurrences(of: "-", with: "")
.trimmingCharacters(in: CharacterSet(charactersIn: "0123456789").inverted)
let withComponents = phoneAddress.components(separatedBy: CharacterSet.decimalDigits.inverted).joined()
But this returns
withTrimming = "2346172891, and my address is Florida 2234"
withComponents = "23461728912234"
when I only want the phone number string "2346172891". I don't know how to solve this.

You can use a regular expression:
let phoneAddress = "My phone number 2346172891, and my address is Florida 2234"
let regex = try! NSRegularExpression(pattern: "[0-9]{10}")
// NSRange counts UTF-16 units, so build it from the string's indices rather than from count
let matches = regex.matches(in: phoneAddress, range: NSRange(phoneAddress.startIndex..., in: phoneAddress))
let phones: [String] = matches.compactMap { match -> String? in
    // Convert each NSRange back to a Range<String.Index> before slicing
    guard let range = Range(match.range, in: phoneAddress) else { return nil }
    return String(phoneAddress[range])
}

Use a regex such as \d{10}:
let string = "My phone number 2346172891, and my address is Florida 2234"
do {
let regexMatches = try NSRegularExpression(pattern: "\\d{10}").matches(in: string, range: NSRange(string.startIndex..., in: string))
// prints out all the phone numbers, one on each line
for match in regexMatches {
guard let range = Range(match.range, in: string) else { continue }
print(string[range])
}
} catch {
print(error)
}
// Output:
// 2346172891
Also, consider using NSDataDetector.
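One thing to keep in mind with \d{10} and [0-9]{10}: they also match the first ten digits of a longer number. If only runs of exactly ten digits should count, lookarounds can rule out a digit on either side. A minimal sketch, assuming that is the intended condition; the helper name digitRuns(ofLength:in:) is hypothetical:

```swift
import Foundation

// Hypothetical helper: returns every run of exactly `count` digits in `text`.
// (?<!\d) and (?!\d) reject a digit directly before or after the run.
func digitRuns(ofLength count: Int, in text: String) -> [String] {
    let pattern = "(?<!\\d)\\d{\(count)}(?!\\d)"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match -> String? in
        guard let range = Range(match.range, in: text) else { return nil }
        return String(text[range])
    }
}

let found = digitRuns(ofLength: 10,
                      in: "My phone number 2346172891, and my address is Florida 2234")
print(found) // ["2346172891"]
```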

how can I edit lots of swift string at once?

In my project, I have a Localizable.strings file with more than 10,000 lines in key-value format.
I need to convert all of the keys, which are in dot.case format like "contentsList.sort.viewCount", to lowerCamelCase. How can I do this with a Swift script? Thank you.
as-is
"contentsList.horizontal.more" = "totall";
to-be
"contentsListHorizontalMore" = "totall";

First split your string into lines. Then compactMap the lines: break each one into components separated by " = " and take the first component (the key), returning nil if there is none. Find all matches of the regex (\w)\.(\w) in the key and, working from the last match to the first so that the earlier ranges stay valid, replace each match with its first group followed by its second group uppercased. This removes the period. Return the converted (camel case) key joined with the remaining components by the " = " separator. Once all lines are converted, you just need to join them with the newline character:
let string = """
"contentsList.horizontal.more" = "totall";
"whatever.vertical.less" = "summ";
"""
let pattern = #"(\w)\.(\w)"#
let lines = string.split(omittingEmptySubsequences: false,
whereSeparator: \.isNewline)
let result: [String] = lines.compactMap {
let comps = $0.components(separatedBy: " = ")
guard var first = comps.first else { return nil }
let regex = try! NSRegularExpression(pattern: pattern)
let matches = regex.matches(in: first, range: NSRange(first.startIndex..., in: first))
let allRanges: [[Range<String.Index>]] = matches.map { match in
(0..<match.numberOfRanges).compactMap { (index: Int) -> Range<String.Index>? in
Range(match.range(at: index), in: first)
}
}
for ranges in allRanges.reversed() {
first.replaceSubrange(ranges[0], with: first[ranges[1]] + first[ranges[2]].uppercased())
}
return (CollectionOfOne(first) + comps.dropFirst())
.joined(separator: " = ")
}
let finalString = result.joined(separator: "\n")
print(finalString)
This will print
"contentsListHorizontalMore" = "totall";
"whateverVerticalLess" = "summ";

You could subclass NSRegularExpression and override replacementString to be able to modify the string represented by the template parameter.
class CapitalizeRegex: NSRegularExpression {
override func replacementString(for result: NSTextCheckingResult, in string: String, offset: Int, template templ: String) -> String {
guard result.numberOfRanges == 2,
let range = Range(result.range(at: 1), in: string) else { return "" }
return string[range].capitalized
}
}
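Then search for a dot followed by a word and capture the word. With this subclass, each match is replaced by the captured word, capitalized, which also drops the dot. Note that capitalized lowercases the remaining letters of the word, so a segment such as viewCount would become Viewcount.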
let string = #"contentsList.horizontal.more" = "totall";"#
let regex = try! CapitalizeRegex(pattern: #"\.(\b\w+\b)"#)
let result = regex.stringByReplacingMatches(in: string,
range: NSRange(string.startIndex..., in: string),
withTemplate: "$1")
print(result)
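For comparison, the key conversion can also be sketched without a regular expression by splitting on the dots and uppercasing only the first character of each later segment, which leaves inner capitals such as the C in viewCount untouched. The function name camelCasedKey is hypothetical, and the sketch assumes it receives the key only, without the surrounding quotes or the value.

```swift
import Foundation

// Hypothetical helper: converts a dot.case key such as
// "contentsList.sort.viewCount" to lowerCamelCase.
func camelCasedKey(_ key: String) -> String {
    let parts = key.split(separator: ".")
    guard let head = parts.first else { return key }
    // Uppercase only the first character of each following segment
    let tail = parts.dropFirst().map { $0.prefix(1).uppercased() + $0.dropFirst() }
    return String(head) + tail.joined()
}

print(camelCasedKey("contentsList.sort.viewCount")) // contentsListSortViewCount
```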

How to do Inverse Regex match [regex negation] in Swift?

Is there a way to do inverse regular expression match and retrieve the un-matching string as return value in iOS Swift?
Let's say,
Input string: "232#$%4lion"
Regex Pattern: "[a-z]{4}"
Normal match Output: "lion"
Inverse match output: "232#$%4" (Expected result)
Please find the normal regex matching swift code below.
func regexMatch() {
let str = "232#$%4lion"
let regex = try! NSRegularExpression(pattern: "[a-z]{4}", options: .caseInsensitive)
let firstMatch = regex.firstMatch(in: str, options: .reportProgress, range: NSMakeRange(0, str.count))
guard let matches = firstMatch else {
print("No Match")
return
}
if matches.numberOfRanges > 0 {
let outputRange = matches.range(at: 0)
let startIndex = str.index(str.startIndex, offsetBy: outputRange.lowerBound)
let endIndex = str.index(str.startIndex, offsetBy: outputRange.upperBound)
print("Matched String: \(str[startIndex..<endIndex])")
}
}
I can somehow manipulate the matching result and then, can retrieve the inverse matching string by manipulating range operations. Instead of doing that,
I want to know if inverse matching can be done using regex pattern itself in Swift and directly retrieve un-matching pattern from the input string.
Above input and output values are for reference purpose only. Just to keep the question simple. Actual values in real time cases can be complex. Please answer with a generic solution.

You may use your regex to remove (replace with an empty string) the matches found:
let result = str.replacingOccurrences(of: "[a-z]{4}", with: "", options: .regularExpression)
Swift test:
let str = "232#$%4lion"
let result = str.replacingOccurrences(of: "[a-z]{4}", with: "", options: .regularExpression)
print(result)
Output: 232#$%4.

There is no need to use a regular expression for that. You can just use filter.
RangeReplaceableCollection has a filter instance method that returns Self, and String conforms to RangeReplaceableCollection, so filter applied to a String returns another String.
You can combine it with the Character property isLetter (new in Swift 5) and write a predicate that negates that property.
let str = "232#$%4lion"
let result = str.filter { !$0.isLetter }
print(result) // "232#$%4"

Either use this pattern, which matches runs of characters outside a-z (it cannot invert the {4} quantifier, because the inverse of a length condition is not predictable),
"[^a-z]+"
or create a new mutable string from str and remove the found range
func invertedRegexMatch() {
let str = "232#$%4lion"
let regex = try! NSRegularExpression(pattern: "[a-z]{4}", options: .caseInsensitive)
let firstMatch = regex.firstMatch(in: str, options: .reportProgress, range: NSRange(str.startIndex..., in: str))
guard let matches = firstMatch else {
print("No Match")
return
}
if matches.numberOfRanges > 0 {
let outputRange = Range(matches.range(at: 0), in: str)!
var invertedString = str
invertedString.removeSubrange(outputRange)
print("Inverted String: \(invertedString)")
}
}
Note: Always use the dedicated APIs to convert between NSRange and Range<String.Index>.
But if you just want to remove all letters, there is a much simpler way:
var str = "232#$%4lion"
str.removeAll{$0.isLetter}
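For the generic case the question asks about, one possible approach is to keep the original pattern and collect the text between its matches. This is only a sketch under that assumption; the function name unmatchedParts(of:in:) is made up for illustration.

```swift
import Foundation

// Hypothetical helper: returns the pieces of `text` that lie between
// (and around) the matches of `pattern`, in order.
func unmatchedParts(of pattern: String, in text: String) throws -> [String] {
    let regex = try NSRegularExpression(pattern: pattern)
    var parts: [String] = []
    var cursor = text.startIndex
    for match in regex.matches(in: text, range: NSRange(text.startIndex..., in: text)) {
        guard let range = Range(match.range, in: text) else { continue }
        // Keep whatever sits between the previous match and this one
        if cursor < range.lowerBound {
            parts.append(String(text[cursor..<range.lowerBound]))
        }
        cursor = range.upperBound
    }
    // Keep the tail after the last match
    if cursor < text.endIndex {
        parts.append(String(text[cursor...]))
    }
    return parts
}
```

With the pattern "[a-z]{4}" and the input "232#$%4lion" this yields ["232#$%4"]. Joining the parts gives the same string as replacing every match with an empty string, while the array keeps the pieces separate.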

How to remove specific characters or words from a string in swift?

var myString = "43321 This is example hahaha 4356-13"
and I want this result:
var resultString = "This is example"
In other words, I want to erase certain words and numbers.
How can I do that?
I have already searched, but I could not find exactly what I wanted.
Please help. Thanks.
To be precise, I want to erase all numbers, but only the specific words I choose.
I made a big mistake, I am sorry. Erasing numbers and certain words is correct, but I should not always delete all digits: digits that are attached to a word must be kept.
Examples:
let testString1 = "123123123123"
let testString2 = "apple 65456876"
let testString3 = "apple banana3 horse5"
let testString4 = "44 apple banana1banana 5horse"
let testString5 = "123 banana123 999"
And I want to remove the word "apple".
So the results should be
let resultString1 = ""
let resultString2 = ""
let resultString3 = "banana3 horse5"
let resultString4 = "banana1banana 5horse"
let resultString5 = "banana123"

You can try the code below:
var sentence = "43321 This is example hahaha 4356-13"
// Remove every digit
sentence = sentence.components(separatedBy: CharacterSet.decimalDigits).joined()
let wordToRemove = "hahaha"
// Remove the first occurrence of the word
if let range = sentence.range(of: wordToRemove) {
    sentence.removeSubrange(range)
}
print(sentence) // " This is example  -"

I think this will be faster with a regex solution:
// Use NSMutableString so that regex.replaceMatches can edit the string in place
let myString = NSMutableString(string: "43321 This is example hahaha 4356-13")
// Add more words to match with the | operator, e.g. "[0-9]{1,}|apple|orange"
// [0-9]{1,} matches each run of digits in one go instead of digit by digit
let regexPattern = "[0-9]{1,}|apple"
// caseInsensitive, or the pattern would have to list every casing
let regex = try NSRegularExpression(pattern: regexPattern, options: .caseInsensitive)
// The range must cover the whole string, measured in UTF-16 units
regex.replaceMatches(in: myString, options: .withoutAnchoringBounds, range: NSRange(location: 0, length: myString.length), withTemplate: "")
print(myString) // " This is example hahaha -"
Subsequent Strings
let testString3 = NSMutableString(string: "apple banana horse")
regex.replaceMatches(in: testString3, options: .withoutAnchoringBounds, range: NSRange(location: 0, length: testString3.length), withTemplate: "")
print(testString3) // " banana horse"

You can try this :
let myString = "43321 This is example hahaha 4356-13"
let stringToReplace = "43321"
let outputStr = myString.replacingOccurrences(of: stringToReplace, with: "")
print(outputStr.trimmingCharacters(in: NSCharacterSet.whitespaces))
//output: This is example hahaha 4356-13

Use this String extension.
extension String {
    // Removes standalone numbers and every occurrence of the given word
    func removeNumbersAString(str: String) -> String {
        // A standalone number is a run of digits delimited by word boundaries
        let withoutNumbers = replacingOccurrences(of: "\\b\\d+\\b", with: "", options: .regularExpression)
        let withoutWord = withoutNumbers.replacingOccurrences(of: str, with: "")
        return withoutWord.trimmingCharacters(in: .whitespaces)
    }
}
A test
var str = "dsghdsghdghdhgsdghghdghds 12233 apple"
print("before \(str)") /// before dsghdsghdghdhgsdghghdghds 12233 apple
print("after \(str.removeNumbersAString(str: "apple"))") /// after dsghdsghdghdhgsdghghdghds

Try the code below:
var myString = "43321 This is example hahaha 4356-13"
// Remove the numbers
myString = myString.components(separatedBy: CharacterSet.decimalDigits).joined()
// Remove a specific word (first occurrence only)
if let range = myString.range(of: "hahaha") {
    myString.removeSubrange(range)
}
print(myString)
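The edited question asks to keep digits that are attached to a word, which removing every digit does not do. One way to sketch that requirement is to delete only whole tokens, using \b word boundaries, and then collapse the leftover whitespace. The function name removingNumbersAndWords is hypothetical, and matching the words case-insensitively is an assumption.

```swift
import Foundation

// Hypothetical helper: removes standalone numbers and the given whole words,
// but keeps digits attached to letters (e.g. "banana3", "5horse").
func removingNumbersAndWords(from text: String, words: [String]) -> String {
    // Escape the words so they are matched literally inside the pattern
    let alternatives = (["\\d+"] + words.map(NSRegularExpression.escapedPattern(for:)))
        .joined(separator: "|")
    let pattern = "\\b(?:\(alternatives))\\b"
    let stripped = text.replacingOccurrences(of: pattern, with: "",
                                             options: [.regularExpression, .caseInsensitive])
    // Collapse the whitespace left behind by the removed tokens
    return stripped.split(whereSeparator: \.isWhitespace).joined(separator: " ")
}

print(removingNumbersAndWords(from: "44 apple banana1banana 5horse", words: ["apple"]))
// banana1banana 5horse
```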

Searching for strings starting with \n\n in Swift

Hey, I have a requirement to increase the spacing in my UILabels for double-spaced line breaks. I want to search my string and find all the substrings starting with \n\n. For example: "Hello world\nI am on the next line\n\nNow I'm on the next line and it's spaced more than before\nNow I'm back to normal spacing". I'm having trouble figuring out the regex for this. I am trying:
let regExRule = "^\n\n*"
and passing it into this function:
func matchesForRegexInText(regex: String, text: String) -> [String] {
    do {
        let regex = try NSRegularExpression(pattern: regex, options: [])
        let nsString = text as NSString
        let results = regex.matches(in: text,
                                    options: [], range: NSMakeRange(0, nsString.length))
        return results.map { nsString.substring(with: $0.range) }
    } catch {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}
However I am getting an empty array. Not really sure how to construct the regex pattern for this. Any pointers would be really appreciated. Thanks!

The primary issue I see is that the pattern should include a capture group to select each of the strings you need. Also note that ^ only matches at the very start of the text unless the .anchorsMatchLines option is set, so "^\n\n*" cannot find a blank line in the middle of the string.
func matchesForRegexInText(regex: String, text: String) -> [String] {
    var captured = [String]()
    let exp = try! NSRegularExpression(pattern: regex, options: [])
    let matches = exp.matches(in: text, options: [], range: NSRange(text.startIndex..., in: text))
    for match in matches {
        // Range 1 is the first capture group
        guard let range = Range(match.range(at: 1), in: text) else { continue }
        captured.append(String(text[range]))
    }
    return captured
}
let re = "\\n\\n([^\\n]+)" // the parentheses capture the line that follows a blank line
// ["Alpha", "Bravo", "Charlie"]
let strResults = matchesForRegexInText(regex: re, text: "\n\nAlpha\n\nBravo\n\nCharlie\n\n")
