I have the following regex string used for determining a valid email address (including special characters e.g. ö, ê, ī, etc):
^(([^<>()[\]\\.,;:\s#\"]+(\.[^<>()[\]\\.,;:\s#\"]+)*)|(\".+\"))#((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$
I tested it on https://regex101.com and it works. To include this string in my code I have to escape it, so I ended up with the following string:
^(([^<>()[\\]\\\\.,;:\\s#\\\"]+(\\.[^<>()[\\]\\\\.,;:\\s#\\\"]+)*)|(\\\".+\\\"))#((\\[[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\])|(([a-zA-Z\\-0-9]+\\.)+[a-zA-Z]{2,}))$
Now, when I run my code:
private static func regexMatch(regex: String, string: String) -> Bool {
    let stringTest = NSPredicate(format: "SELF MATCHES %@", regex)
    return stringTest.evaluateWithObject(string)
}
My app crashes with the following error:
*** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'Can't do regex matching, reason: Can't open pattern U_REGEX_MISSING_CLOSE_BRACKET (string scött.hôdśōn#example.com, pattern ^(([^<>()[]\.,;:\s#\"]+(.[^<>()[]\.,;:\s#\"]+)*)|(\".+\"))#(([[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}])|(([a-zA-Z-0-9]+.)+[a-zA-Z]{2,}))$, case 0, canon 0)'
My guess is I am somehow escaping the regex string incorrectly. Can somebody point me in the right direction?
The comments about actually improving your regex should probably be heeded. There were a few things you were not escaping when you should have, specifically ], and a few you were escaping when you didn't need to, specifically . inside [] and ". After fixing these in your regex:
^(([^<>()[\].,;:\s#"]+(.[^<>()[\].,;:\s#"]+)*)|(".+"))#(([[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}])|(([a-zA-Z-0-9]+.)+[a-zA-Z]{2,}))$
And then escaping for special characters, we get
"/^(([^<>()[\\].,;:\\s#\"]+(.[^<>()[\\].,;:\\s#\"]+)*)|(\".+\"))#(([[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}])|(([a-zA-Z-0-9]+.)+[a-zA-Z]{2,}))$/"
I suspect the "missing bracket" error was caused by the improper escaping of ], which prematurely closed some of your character classes. Again, the commenters on your post are correct that the regex itself could definitely be improved.
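While debugging the escaping, it can help to compile the pattern with NSRegularExpression first, because its initializer throws a catchable Swift error instead of raising an exception that terminates the app. The following is only a minimal sketch: the helper name `compiles` is hypothetical, and the two patterns are simplified stand-ins, not the email regex from the question.

```swift
import Foundation

// Hypothetical helper: returns true if the regex engine can compile the pattern.
func compiles(_ pattern: String) -> Bool {
    do {
        _ = try NSRegularExpression(pattern: pattern)
        return true
    } catch {
        // The error describes why the pattern was rejected.
        print("Invalid pattern \(pattern): \(error.localizedDescription)")
        return false
    }
}

// Raw string literal (Swift 5+): the regex engine receives exactly what is typed.
// Both brackets are escaped inside the character class.
let escapedBrackets = #"^[^\[\]\s]+$"#

// A character class that is never closed.
let missingClose = "[a-z"

print(compiles(escapedBrackets))
print(compiles(missingClose))
```

Checking the pattern this way separates "the pattern is malformed" from "the pattern does not match", which the crash inside NSPredicate does not let you do.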
I updated my regex string to the following and now it seems to be working.
^([^x00-\\\\x7F]|[\\w-\\.])+#((([^x00-\\\\x7F]|)[\\w-])+\\.)+[\\w-]{2,4}$
I encountered an error in a script I was debugging because somebody had created a variable with a name matching a built-in function, rendering the function inaccessible. I got strange errors when I tried to use the function, like:
incorrect arguments for (-)
incorrect arguments for (by)
incorrect arguments for ([)
incorrect arguments for (=)
Example code:
int length
// ...
// ...
string substr
string str = "big long string with lots of text"
substr = str[0:length(str)-2]
Is there a way to access the original length() function in this situation? I was actually just trying to add debug output to the existing script, not trying to modify the script, when I encountered this error.
For now I have just renamed the variable.
Well, if you have no way to modify the code, e.g. because it is encrypted, you could do something like this:
int length_original (string s) { return length s }
<<here is the code of your function>>
int length (string s) {return length_original s }
I have a Swift function func hexStrToBytes(input: String) throws -> [UInt8] that translates a hex string into a UInt8 array, and it may need to throw an exception when the format of the input is not correct.
I can't verify the format myself; instead I use bytes.append(UInt8(str,radix: 16)!). If this call succeeds, my function can return a correct value. If something goes wrong with bytes.append(UInt8(str,radix: 16)!), I need to throw an exception. But how do I know it went wrong? The program may just crash, and I don't even get a chance to throw an exception.
This snippet should be close to what you're looking for. It uses optional binding to check whether the string is convertible and throws an error if it is not.
if let theByte = UInt8(str, radix: 16) {
    bytes.append(theByte)
} else {
    throw MyError()
}
Note that Swift doesn't use the term "exceptions" to describe its error-handling mechanisms. While the throw/catch control flow is similar, they are not implemented the same way as exceptions in other languages.
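For context, here is one way the whole function might look with that check in place. This is a sketch under assumptions: the `HexError` type and its cases are hypothetical (the snippet above only names `MyError`), and the input is assumed to be a plain run of hex digit pairs with no prefix or separators.

```swift
// Hypothetical error type for illustration.
enum HexError: Error, Equatable {
    case oddLength
    case invalidByte(String)
}

func hexStrToBytes(input: String) throws -> [UInt8] {
    let chars = Array(input)
    // Each byte needs exactly two hex digits.
    guard chars.count % 2 == 0 else { throw HexError.oddLength }

    var bytes: [UInt8] = []
    bytes.reserveCapacity(chars.count / 2)

    for i in stride(from: 0, to: chars.count, by: 2) {
        let str = String(chars[i...i + 1])
        // UInt8(_:radix:) returns nil instead of crashing on bad input.
        // The digit check also rejects a leading sign such as "+f",
        // which the integer initializer would otherwise accept.
        guard str.allSatisfy({ $0.isHexDigit }),
              let theByte = UInt8(str, radix: 16) else {
            throw HexError.invalidByte(str)
        }
        bytes.append(theByte)
    }
    return bytes
}
```

A caller would then wrap the call in do/catch, for example `do { let bytes = try hexStrToBytes(input: "00ff1A") } catch { ... }`, instead of relying on a force unwrap.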
I'm familiar with writing PCRE regexes, but they don't seem to work in Swift. I use
^([1-9]\d{0,2}(\,\d{3})*|([1-9]\d*))(\.\d{2})?$
to validate numbers like 1,000,000.00
However, putting this in my Swift function causes an error.
extension String {
    func isValidNumber() -> Bool {
        let regex = NSRegularExpression(pattern: "^([1-9]\d{0,2}(\,\d{3})*|([1-9]\d*))(\.\d{2})?$", options: .CaseInsensitive, error: nil)
        return regex?.firstMatchInString(self, options: nil, range: NSMakeRange(0, countElements(self))) != nil
    }
}
"Invalid escape sequence in literal"
This is, of course, because PCRE uses the "\" character, which Swift interprets as an escape (I believe?).
So I can't just use the regexes I'm used to. How do I translate them to be compatible with Swift code?
Within double quotes, a single backslash is read as the start of an escape sequence. You need to escape every backslash one more time in order for it to reach the regex engine as a backslash character.
"^([1-9]\\d{0,2}(,\\d{3})*|([1-9]\\d*))(\\.\\d{2})?$"
Edit (June 2022)
From Swift 5.7, which you can use on Xcode 14.0 Beta 1 or later, you can use /.../ like this:
// Regex type
let regex = /^([1-9]\d{0,2}(\,\d{3})*|([1-9]\d*))(\.\d{2})?$/
Edit (Dec 2022): Since this internally creates a Regex, a type introduced in iOS 16 and macOS 13, the minimum deployment target must cover those OS versions.
Advantages over #"..."#:
Your regex pattern is parsed at compile time, so once your program compiles you don't need to worry about whether the pattern is valid
If your pattern is invalid, the compiler tells you specifically which part is invalid as a compile error
Syntax highlighting is applied
So your code would look like this:
extension String {
    func isValidNumber() -> Bool {
        let regex = /^([1-9]\d{0,2}(\,\d{3})*|([1-9]\d*))(\.\d{2})?$/
            .ignoresCase()
        return (try? regex.firstMatch(in: self)) != nil
    }
}
Original answer
Since Swift 5, you can use #"..."# like this, so that you don't need to add extra escape sequences for Swift:
#"^([1-9]\d{0,2}(\,\d{3})*|([1-9]\d*))(\.\d{2})?$"#
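If the deployment target rules out the Regex type, the raw string form can be combined with NSRegularExpression. The following is a rough sketch rather than a definitive version: the method name `isValidNumberNS` is hypothetical (chosen to avoid clashing with `isValidNumber` above), and the unnecessary backslash before the comma has been dropped.

```swift
import Foundation

extension String {
    // Hypothetical name; compiles the pattern at runtime with NSRegularExpression.
    func isValidNumberNS() -> Bool {
        // Raw string literal: backslashes are passed to the regex engine unchanged.
        let pattern = #"^([1-9]\d{0,2}(,\d{3})*|([1-9]\d*))(\.\d{2})?$"#
        guard let regex = try? NSRegularExpression(pattern: pattern) else {
            return false
        }
        // NSRegularExpression works on UTF-16 ranges, so convert via NSRange.
        let range = NSRange(startIndex..., in: self)
        return regex.firstMatch(in: self, options: [], range: range) != nil
    }
}
```

Unlike the /.../ literal, an invalid pattern here is only discovered at runtime, which is why the sketch returns false when the initializer throws.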
My program checks if an NSError object exists, and sends it to another method, like this:
if ([response isEqualToString:@""]) {
    [self handleError:commandError];
}
In handleError:, I try checking the localized description against an expected string like this:
- (void)handleError:(NSError *)error
{
    NSString *errorDescription = [error localizedDescription];
    NSLog(@"%@", errorDescription); // works fine
    if ([errorDescription isEqualToString:@"sudo: no tty present and no askpass program specified"]) {
        NSLog(@"SO Warning: Attempted to execute sudo command");
    }
}
However, the if statement isn't firing. The log outputs precisely the same thing I typed out in the if statement.
Unless you seriously think the if statement in Objective-C is broken, or the isEqualToString: implementation is broken, the strings aren't the same and there is no mystery:
What you typed out is either using different characters (see: unicode and character encoding types) or there are invisible/nonprinting characters in your log output that you're not typing because you can't see them.
I'd suggest looping through the characters in your string and printing out their character code values:
for (NSUInteger i = 0; i < errorDescription.length; i++)
    NSLog(@"%lu: U+%04X", (unsigned long)i, [errorDescription characterAtIndex:i]);
You'll find that the character code sequence of the string you typed is not equal to the sequence returned by the localizedDescription method.
As others have said, basing program logic on exact character strings you don't control and which can change without notice is likely not an optimum solution here. What about error codes?
I would suggest using error codes. Since you're using a library over which you have no control, its exposed interface should usually tell you which error codes are associated with each type of expected error. Using error codes would make your code more robust, clearer, and independent of string contents.
If you have a good reason to keep comparing string values anyway, be aware of possible differences in punctuation, formatting characters such as newlines, and lowercase versus uppercase letters.
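To make both suggestions concrete, here is a sketch written in Swift rather than the question's Objective-C. Everything named here is an assumption for illustration: the domain string, the code value, and the helper names are hypothetical, and the real values would come from whatever the library documents.

```swift
import Foundation

// Hypothetical domain and code; substitute the values your library documents.
let expectedDomain = "com.example.CommandErrorDomain"
let expectedCode = 1

// Preferred: compare domain and code instead of the human-readable text.
func isExpectedError(_ error: NSError) -> Bool {
    return error.domain == expectedDomain && error.code == expectedCode
}

// If a string comparison is unavoidable, trim surrounding whitespace and newlines first.
func descriptionMatches(_ error: NSError, _ expected: String) -> Bool {
    let actual = error.localizedDescription
        .trimmingCharacters(in: .whitespacesAndNewlines)
    return actual == expected
}

// Dump Unicode scalar values to reveal invisible characters.
func scalarValues(_ text: String) -> [UInt32] {
    return text.unicodeScalars.map { $0.value }
}

let message = "sudo: no tty present and no askpass program specified"
let error = NSError(domain: expectedDomain,
                    code: expectedCode,
                    userInfo: [NSLocalizedDescriptionKey: message + "\n"])

print(isExpectedError(error))
print(descriptionMatches(error, message))
print(scalarValues(error.localizedDescription).suffix(3))
```

In this constructed example the description ends with a newline, so a plain equality check against the typed string would fail while the log output looks identical.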
I was under the impression that in F# the following two lines are supposed to give identical results:
let a = string v
let a = v.ToString()
It is implied that v is an object. It turns out that if v is a System.Guid the first line just throws an exception:
System.FormatException occurred
Message="Format String can be only \"D\", \"d\", \"N\", \"n\", \"P\", \"p\", \"B\" or \"b\"."
Source="mscorlib"
StackTrace:
at System.Guid.ToString(String format, IFormatProvider provider)
InnerException:
I can certainly deal with Guids separately, the question is what other objects will give me the same trouble? Should I avoid using the string operator at all?
In my case the object can potentially be anything.
This is a bug that is (will be) fixed in the next release.
(In general, it should work; the bug is because System.Guid does not respond to the IFormattable "G" specifier, despite the fact that the docs for IFormattable say that all implementers must implement the "G" specifier. So it's actually kinda a bug in System.Guid, but the F# library will work around this bug in its 'string' operator in the next release.
In short, you can use this operator safely, except for Guid right now, but that will be fixed soon. In the meantime you can special-case Guid.)