Is there a way to do an inverse regular-expression match in Swift on iOS and get the non-matching part of the string as the return value?
Let's say,
Input string: "232#$%4lion"
Regex Pattern: "[a-z]{4}"
Normal match Output: "lion"
Inverse match output: "232#$%4" (Expected result)
The normal regex-matching Swift code is below.
func regexMatch() {
let str = "232#$%4lion"
let regex = try! NSRegularExpression(pattern: "[a-z]{4}", options: .caseInsensitive)
let firstMatch = regex.firstMatch(in: str, options: .reportProgress, range: NSMakeRange(0, str.count))
guard let matches = firstMatch else {
print("No Match")
return
}
if matches.numberOfRanges > 0 {
let outputRange = matches.range(at: 0)
let startIndex = str.index(str.startIndex, offsetBy: outputRange.lowerBound)
let endIndex = str.index(str.startIndex, offsetBy: outputRange.upperBound)
print("Matched String: \(str[startIndex..<endIndex])")
}
}
I could take the matching result and derive the inverse string through range operations. Instead of doing that,
I want to know whether inverse matching can be done with the regex pattern itself in Swift, retrieving the non-matching part directly from the input string.
The input and output values above are for reference only, to keep the question simple. Real-world values can be more complex, so please answer with a generic solution.
You may use your regex to remove (replace with an empty string) the matches found:
let result = str.replacingOccurrences(of: "[a-z]{4}", with: "", options: .regularExpression)
Swift test:
let str = "232#$%4lion"
let result = str.replacingOccurrences(of: "[a-z]{4}", with: "", options: .regularExpression)
print(result)
Output: 232#$%4
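If the text outside the matches is needed as separate pieces rather than one concatenated string, the gaps between matches can be collected directly. The following is only a sketch of that idea; the helper name unmatchedSegments is hypothetical and not part of Foundation.

```swift
import Foundation

// Hypothetical helper (not a Foundation API): returns the pieces of `input`
// that are not covered by any match of `pattern`, in order.
func unmatchedSegments(of input: String, pattern: String) throws -> [String] {
    let regex = try NSRegularExpression(pattern: pattern)
    let fullRange = NSRange(input.startIndex..., in: input)
    var segments: [String] = []
    var cursor = input.startIndex
    for match in regex.matches(in: input, range: fullRange) {
        guard let range = Range(match.range, in: input) else { continue }
        // Keep the text between the previous match and this one
        if cursor < range.lowerBound {
            segments.append(String(input[cursor..<range.lowerBound]))
        }
        cursor = range.upperBound
    }
    // Keep whatever follows the last match
    if cursor < input.endIndex {
        segments.append(String(input[cursor...]))
    }
    return segments
}

let pieces = (try? unmatchedSegments(of: "232#$%4lion", pattern: "[a-z]{4}")) ?? []
print(pieces.joined()) // 232#$%4
```

Joining the pieces gives the same result as replacing the matches with an empty string; keeping them separate is useful when the position of each gap matters.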
There is no need to use a regular expression for that. You can just use filter.
RangeReplaceableCollection has a filter instance method that returns Self. String conforms to RangeReplaceableCollection, so filter applied to a String returns another String.
You can combine it with the Character property isLetter (new in Swift 5) and a predicate that negates it:
let str = "232#$%4lion"
let result = str.filter { !$0.isLetter }
print(result) // "232#$%4"
Either use the negated pattern below (note that it cannot invert the {4} quantifier, because the length of the non-matching text is not predictable)
"[^a-z]+"
or create a new mutable string from str and remove the found range
func invertedRegexMatch() {
    let str = "232#$%4lion"
    let regex = try! NSRegularExpression(pattern: "[a-z]{4}", options: .caseInsensitive)
    // Build the NSRange from the string itself so UTF-16 offsets are correct
    let firstMatch = regex.firstMatch(in: str, options: [], range: NSRange(str.startIndex..., in: str))
    guard let matches = firstMatch else {
        print("No Match")
        return
    }
    if matches.numberOfRanges > 0 {
        let outputRange = Range(matches.range(at: 0), in: str)!
        // Copy the input and cut the matched range out of the copy
        var invertedString = str
        invertedString.removeSubrange(outputRange)
        print("Inverted String: \(invertedString)")
    }
}
Note: Always use the dedicated APIs to convert between NSRange and Range<String.Index>.
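To illustrate why the conversion APIs matter, here is a small sketch with an input chosen for this example (an emoji followed by letters). NSRange counts UTF-16 code units, while String.count counts Characters, so the two lengths differ as soon as the string contains a character outside the Basic Multilingual Plane.

```swift
import Foundation

let text = "\u{1F609}lion"   // winking-face emoji followed by "lion"
print(text.count)            // 5 (Characters)
print(text.utf16.count)      // 6 (UTF-16 code units)

// The dedicated initializer produces the correct UTF-16 based range
let nsRange = NSRange(text.startIndex..., in: text)
print(nsRange.length)        // 6

let regex = try! NSRegularExpression(pattern: "[a-z]{4}")
var matched = ""
if let match = regex.firstMatch(in: text, range: nsRange),
   let range = Range(match.range, in: text) {
    // Convert back with Range(_:in:) instead of offsetting String indices by hand
    matched = String(text[range])
}
print(matched)               // lion
```

With NSMakeRange(0, text.count) the searched range would be one code unit too short here, and offsetting String indices by NSRange locations would select the wrong characters.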
But if you just want to remove all letters, there is a much simpler way:
var str = "232#$%4lion"
str.removeAll{$0.isLetter}
Related
In my project, I have a Localizable.strings file with more than 10,000 lines in key-value format.
I need to convert all of the keys, which are in dot-case format like "contentsList.sort.viewCount", to lowerCamelCase. How can I do this with a Swift script? Thank you.
as-is
"contentsList.horizontal.more" = "totall";
to-be
"contentsListHorizontalMore" = "totall";
First split the string into lines. For each line (inside compactMap), split it into components around " = " and take the first component, the key; return nil if there is none. Find every match of the regex (\w)\.(\w) in the key. Replace each match with its first group followed by its second group uppercased, which removes the period; working through the matches in reverse keeps the earlier ranges valid. Then rejoin the converted (camel case) key with the remaining components using " = ". Finally, join all the lines with the newline character:
let string = """
"contentsList.horizontal.more" = "totall";
"whatever.vertical.less" = "summ";
"""
let pattern = #"(\w)\.(\w)"#
let lines = string.split(omittingEmptySubsequences: false,
whereSeparator: \.isNewline)
let result: [String] = lines.compactMap {
let comps = $0.components(separatedBy: " = ")
guard var first = comps.first else { return nil }
let regex = try! NSRegularExpression(pattern: pattern)
let matches = regex.matches(in: first, range: NSRange(first.startIndex..., in: first))
let allRanges: [[Range<String.Index>]] = matches.map { match in
(0..<match.numberOfRanges).compactMap { (index: Int) -> Range<String.Index>? in
Range(match.range(at: index), in: first)
}
}
for ranges in allRanges.reversed() {
first.replaceSubrange(ranges[0], with: first[ranges[1]] + first[ranges[2]].uppercased())
}
return (CollectionOfOne(first) + comps.dropFirst())
.joined(separator: " = ")
}
let finalString = result.joined(separator: "\n")
print(finalString)
This will print
"contentsListHorizontalMore" = "totall";
"whateverVerticalLess" = "summ";
You could subclass NSRegularExpression and override replacementString to be able to modify the string represented by the template parameter.
class CapitalizeRegex: NSRegularExpression {
override func replacementString(for result: NSTextCheckingResult, in string: String, offset: Int, template templ: String) -> String {
guard result.numberOfRanges == 2,
let range = Range(result.range(at: 1), in: string) else { return "" }
return string[range].capitalized
}
}
Then search for a dot followed by a word and capture the word. With the template "$1", the overridden method replaces each match with the captured word, capitalized:
let string = #""contentsList.horizontal.more" = "totall";"#
let regex = try! CapitalizeRegex(pattern: #"\.(\b\w+\b)"#)
let result = regex.stringByReplacingMatches(in: string,
range: NSRange(string.startIndex..., in: string),
withTemplate: "$1")
print(result)
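For comparison with the regex-based answers above, the key conversion can also be sketched without a regular expression by splitting on the dot. The helper name camelCased(dotCaseKey:) is hypothetical, and the sketch assumes it receives only the key (without the surrounding quotes or the value).

```swift
import Foundation

// Hypothetical helper: converts "a.b.c" to "aBC"-style lowerCamelCase.
// Assumes `key` is the bare key, without quotes or the " = value" part.
func camelCased(dotCaseKey key: String) -> String {
    let parts = key.split(separator: ".")
    guard let head = parts.first else { return key }
    // Uppercase only the first character of each later component
    let tail = parts.dropFirst().map { part -> String in
        String(part.prefix(1)).uppercased() + String(part.dropFirst())
    }
    return ([String(head)] + tail).joined()
}

print(camelCased(dotCaseKey: "contentsList.horizontal.more")) // contentsListHorizontalMore
```

Unlike the capitalized property, this leaves the rest of each component untouched, so an inner capital such as the L in contentsList is preserved.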
Can anyone give me a Swift regex to identify consecutive characters in a string?
My regex is .*(.)\\1$, but it is not working. My code is:
let regex = ".*(.)\\1$"
return NSPredicate(format: "SELF MATCHES %@", regex).evaluate(with: string)
Examples:
abc123abc -> should be valid
abc11qwe or aa12345 -> should not be valid because of 11 and aa
Thanks
This regex may help: it identifies consecutive repeating characters and gives the expected results for the samples you shared, but you should test it against other possible inputs.
(.)\\1
Try this and see:
let string = "aabc1123abc"
//let string = "abc123abc"
let regex = "(.)\\1"
if let range = string.range(of: regex, options: .regularExpression) {
print("range - \(range)")
}
// or
if string.range(of: regex, options: .regularExpression) != nil {
print("found consecutive characters")
}
Use NSRegularExpression instead of NSPredicate:
let arrayOfStrings = ["abc11qwe", "asdfghjk"]
for string in arrayOfStrings {
    var result = false
    do {
        let regex = try NSRegularExpression(pattern: "(.)\\1", options: [.dotMatchesLineSeparators])
        // NSRange measures UTF-16 code units, hence utf16.count
        let match = regex.firstMatch(in: string, range: NSMakeRange(0, string.utf16.count))
        result = match != nil
    } catch {
        print("invalid regex: \(error.localizedDescription)")
    }
    debugPrint(result)
}
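As a quick check against the examples from the question, the (.)\\1 pattern can be wrapped in a small validation function. This is only a sketch; the function name hasConsecutiveRepeat is hypothetical.

```swift
import Foundation

// Hypothetical helper: true when any character is immediately followed by itself.
func hasConsecutiveRepeat(_ string: String) -> Bool {
    string.range(of: "(.)\\1", options: .regularExpression) != nil
}

for sample in ["abc123abc", "abc11qwe", "aa12345"] {
    // A string is valid when it contains no consecutive repeated character
    print(sample, hasConsecutiveRepeat(sample) ? "invalid" : "valid")
}
// abc123abc valid
// abc11qwe invalid
// aa12345 invalid
```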
How do I remove, not decode, percent-escaped characters from a string using Swift? For instance:
"hello%20there"
should become
"hellothere"
EDIT:
I would like to replace multiple percent-escaped characters in a string. So:
"hello%20there%0Dperson"
should become
"hellothereperson"
let string = originalString.replacingOccurrences(of: "%[0-9a-fA-F]{2}",
with: "",
options: .regularExpression,
range: nil)
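Applied to the inputs from the question, the replacement can be sketched as a small function. The name removingPercentEscapes(from:) is hypothetical and chosen only for this example.

```swift
import Foundation

// Hypothetical helper: strips every %XX escape without decoding it.
func removingPercentEscapes(from string: String) -> String {
    string.replacingOccurrences(of: "%[0-9a-fA-F]{2}",
                                with: "",
                                options: .regularExpression)
}

print(removingPercentEscapes(from: "hello%20there"))          // hellothere
print(removingPercentEscapes(from: "hello%20there%0Dperson")) // hellothereperson
```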
You can use the removingPercentEncoding property. Note that it decodes the escapes (%20 becomes a space) rather than removing them:
let percentEncodedString = "hello%20there%0Dperson"
let decodedString = percentEncodedString.removingPercentEncoding ?? ""
You can use a regex that matches % followed by two hexadecimal digits: %[0-9a-fA-F]{2}
let myString = "hello%20there%0D%24person"
let regex = try! NSRegularExpression(pattern: "%[0-9a-fA-F]{2}", options: [])
// NSRange counts UTF-16 code units, so build it from the string itself
let range = NSRange(myString.startIndex..., in: myString)
let modString = regex.stringByReplacingMatches(in: myString,
                                               options: [],
                                               range: range,
                                               withTemplate: "")
print(modString) // hellothereperson
let input = "hello%20there%0Dperson"
if let output = input.removingPercentEncoding {
    // Pass the string as an argument, not as the format string
    NSLog("%@", output)
} else {
    NSLog("failed to remove percent encoding")
}
and the result is
hello there
person
(%20 decodes to a space and %0D to a carriage return.) You can then strip the whitespace,
or you can remove the escapes directly with a regex:
"%([0-9a-fA-F]{2})"
Hey, I have a requirement to increase the spacing in my UILabels for double-spaced line breaks. I want to search my string and find all the substrings starting with \n\n. For example: "Hello world\nI am on the next line\n\nNow I'm on the next line and it's spaced more than before\nNow I'm back to normal spacing". I'm having trouble figuring out the regex for this. I am trying:
let regExRule = "^\n\n*"
and passing it into this function:
func matchesForRegexInText(regex: String, text: String) -> [String] {
do {
let regex = try NSRegularExpression(pattern: regex, options: [])
let nsString = text as NSString
let results = regex.matchesInString(text,
options: [], range: NSMakeRange(0, nsString.length))
return results.map { nsString.substringWithRange($0.range)}
} catch let error as NSError {
print("invalid regex: \(error.localizedDescription)")
return []
}
}
However I am getting an empty array. Not really sure how to construct the regex pattern for this. Any pointers would be really appreciated. Thanks!
The primary issue I see is that the regex pattern should include a capture group to select each of the strings needed.
func matchesForRegexInText(regex: String, text: String) -> [String] {
    var captured = [String]()
    let exp = try! NSRegularExpression(pattern: regex, options: [])
    // NSRange counts UTF-16 code units, so build it from the string itself
    let matches = exp.matches(in: text, options: [], range: NSRange(text.startIndex..., in: text))
    for match in matches {
        // Group 1 holds the text that follows the double newline
        if let range = Range(match.range(at: 1), in: text) {
            captured.append(String(text[range]))
        }
    }
    return captured
}
// The class allows word characters, spaces, commas and apostrophes, but not newlines
let re = "\\n\\n([\\w ,']+)" // selection with (...)
// ["Alpha", "Bravo", "Charlie"]
let strResults = matchesForRegexInText(regex: re, text: "\n\nAlpha\n\nBravo\n\nCharlie\n\n")
I'm getting Unicode scalar notation for emojis in a text string from a server, and they fail to show up as emojis when I display them in a UILabel. This is the format in which I get my string from the server:
let string = "Hi, I'm lily U+1F609"
This doesn't work unless it's changed to
let string = "Hi, I'm lily \u{1F609}"
Is there any way I can convert the string to the required format?
I don't want to use a regex to determine occurrences of U+<HEX_CODE> and then converting them to \u{HEX_CODE}. There has to be a better way of doing this.
This is exactly the kind of problem that regex was created for. If there's a simpler non-regex solution, I'll delete this answer:
func replaceWithEmoji(str: String) -> String {
    var result = str
    let regex = try! NSRegularExpression(pattern: "(U\\+([0-9A-F]+))", options: [.caseInsensitive])
    let matches = regex.matches(in: result, options: [], range: NSRange(result.startIndex..., in: result))
    // Replace from the end so the ranges of earlier matches stay valid
    for m in matches.reversed() {
        // Group 1 is the whole "U+XXXX" token, group 2 is only the hex digits
        guard let tokenRange = Range(m.range(at: 1), in: result),
              let hexRange = Range(m.range(at: 2), in: result),
              let codePoint = UInt32(result[hexRange], radix: 16),
              let scalar = UnicodeScalar(codePoint) else { continue }
        result.replaceSubrange(tokenRange, with: String(Character(scalar)))
    }
    return result
}
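The core step of the function above, turning the hexadecimal digits into a displayable character, can be looked at in isolation. This is a minimal sketch; the helper name emoji(fromHex:) is hypothetical. It returns nil for input that is not valid hexadecimal or not a valid Unicode scalar value.

```swift
import Foundation

// Hypothetical helper: converts hex digits such as "1F609" into the
// corresponding single-character string, or nil if that is not possible.
func emoji(fromHex hex: String) -> String? {
    guard let value = UInt32(hex, radix: 16),
          let scalar = Unicode.Scalar(value) else { return nil }
    return String(Character(scalar))
}

let wink = emoji(fromHex: "1F609")
print(wink == "\u{1F609}")              // true
print(emoji(fromHex: "D800") == nil)    // true (surrogates are not valid scalars)
```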