regex validation - grails constraints - grails

I'm pretty new to Grails and I'm having a problem with a matches validation using regex. I want my field to accept a combination of alphanumeric characters and specific special characters: period (.), comma (,) and dash (-). It may accept numbers only (099) or letters only (alpha), but it must not accept input that consists only of special characters (".-,"). Is it possible to filter this kind of input using regex?
Please help. Thank you for sharing your knowledge.

^[0-9a-zA-Z,.-]*?[0-9a-zA-Z]+?[0-9a-zA-Z,.-]*$
meaning:
/
^ beginning of the string
[...]*? 0 or more characters from this class (lazy matching)
[...]+? 1 or more characters from this class (lazy matching)
[...]* 0 or more characters from this class
$ end of the string
/
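As a quick sanity check, the pattern can be exercised outside Grails. The sketch below uses Python's re module purely for illustration (an assumption; the Grails matches constraint would take the same pattern as a string), and the helper name is_valid is hypothetical.

```python
import re

# Pattern from the answer above: only allowed characters, with at least
# one letter or digit somewhere in the string.
PATTERN = re.compile(r"^[0-9a-zA-Z,.-]*?[0-9a-zA-Z]+?[0-9a-zA-Z,.-]*$")


def is_valid(value: str) -> bool:
    """Return True when the whole value matches the pattern."""
    return PATTERN.fullmatch(value) is not None


for sample in ["099", "alpha", "a.b-c,1", ".-,", "", "a b"]:
    print(repr(sample), is_valid(sample))
```

The first three samples should be accepted, while the special-characters-only string, the empty string and the string containing a space should be rejected.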

I think you could match that with a regular expression like this:
"[0-9a-zA-Z.,-]*[0-9a-zA-Z]+[0-9a-zA-Z.,-]*"
That means:
"[0-9a-zA-Z.,-]" Have characters in the range 0-9, a-z, A-Z, or . or , or -
"*" Have zero or more of these characters
"[0-9a-zA-Z]" Have a letter or a digit
"+" Have one or more of this kind of character (so it's mandatory to have one in this set)
"[0-9a-zA-Z.,-]" Again, characters in the range 0-9, a-z, A-Z, or . or , or -
"*" Have zero or more of these characters
Hope it helps!

Related

Validate name to have no tabs or backslashes - Rails [duplicate]

I need a regular expression able to match everything but a string starting with a specific pattern (specifically index.php and what follows, like index.php?id=2342343).
Regex: match everything but:
a string starting with a specific pattern (e.g. any - empty, too - string not starting with foo):
Lookahead-based solution for NFAs:
^(?!foo).*$
^(?!foo)
Negated character class based solution for regex engines not supporting lookarounds:
^(([^f].{2}|.[^o].|.{2}[^o]).*|.{0,2})$
^([^f].{2}|.[^o].|.{2}[^o])|^.{0,2}$
a string ending with a specific pattern (say, no world. at the end):
Lookbehind-based solution:
(?<!world\.)$
^.*(?<!world\.)$
Lookahead solution:
^(?!.*world\.$).*
^(?!.*world\.$)
POSIX workaround:
^(.*([^w].{5}|.[^o].{4}|.{2}[^r].{3}|.{3}[^l].{2}|.{4}[^d].|.{5}[^.])|.{0,5})$
([^w].{5}|.[^o].{4}|.{2}[^r].{3}|.{3}[^l].{2}|.{4}[^d].|.{5}[^.]$|^.{0,5})$
a string containing specific text (say, not match a string having foo):
Lookaround-based solution:
^(?!.*foo)
^(?!.*foo).*$
POSIX workaround:
Use the online regex generator at www.formauri.es/personal/pgimeno/misc/non-match-regex
a string containing specific character (say, avoid matching a string having a | symbol):
^[^|]*$
a string equal to some string (say, not equal to foo):
Lookaround-based:
^(?!foo$)
^(?!foo$).*$
POSIX:
^(.{0,2}|.{4,}|[^f]..|.[^o].|..[^o])$
a sequence of characters:
PCRE (match any text but cat): /cat(*SKIP)(*FAIL)|[^c]*(?:c(?!at)[^c]*)*/i or /cat(*SKIP)(*FAIL)|(?:(?!cat).)+/is
Other engines allowing lookarounds: (cat)|[^c]*(?:c(?!at)[^c]*)* (or (?s)(cat)|(?:(?!cat).)*, or (cat)|[^c]+(?:c(?!at)[^c]*)*|(?:c(?!at)[^c]*)+[^c]*) and then check with language means: if Group 1 matched, it is not what we need, else, grab the match value if not empty
a certain single character or a set of characters:
Use a negated character class: [^a-z]+ (any char other than a lowercase ASCII letter)
Matching any char(s) but |: [^|]+
Demo note: the newline \n is used inside negated character classes in demos to avoid match overflow to the neighboring line(s). They are not necessary when testing individual strings.
Anchor note: In many languages, use \A to define the unambiguous start of string, and \z (in Python, it is \Z, in JavaScript, $ is OK) to define the very end of the string.
Dot note: In many flavors (but not POSIX, TRE, TCL), . matches any char but a newline char. Make sure you use a corresponding DOTALL modifier (/s in PCRE/Boost/.NET/Python/Java and /m in Ruby) for the . to match any char including a newline.
Backslash note: In languages where you have to declare patterns with C strings allowing escape sequences (like \n for a newline), you need to double the backslashes escaping special characters so that the engine can treat them as literal characters (e.g. in Java, world\. will be declared as "world\\.", or use a character class: "world[.]"). Use raw string literals (Python r'\bworld\b'), C# verbatim string literals @"world\.", or slashy strings/regex literal notations like /world\./.
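To make the list above concrete, here is a small Python sketch that runs several of the lookahead-based patterns against one string that should match and one that should not. It also brute-forces a comparison between the lookahead and POSIX-style forms of "not equal to foo". The sample strings, the helper name is_match and the three-letter test alphabet are assumptions chosen for illustration.

```python
import re
from itertools import product

CASES = [
    # (description, pattern, string that should match, string that should not)
    ("not starting with foo", r"^(?!foo).*$", "bar foo", "foobar"),
    ("not ending with world.", r"^(?!.*world\.$).*", "hello world", "hello world."),
    ("not containing foo", r"^(?!.*foo).*$", "bar", "a foo b"),
    ("not containing |", r"^[^|]*$", "ab", "a|b"),
    ("not equal to foo", r"^(?!foo$).*$", "food", "foo"),
]


def is_match(pattern: str, text: str) -> bool:
    """True when the pattern matches starting at the beginning of text."""
    return re.match(pattern, text) is not None


for description, pattern, good, bad in CASES:
    print(f"{description}: {good!r} -> {is_match(pattern, good)}, "
          f"{bad!r} -> {is_match(pattern, bad)}")

# The lookahead form and the POSIX-style form of "not equal to foo"
# should agree on every short string built from the letters f, o, x.
LOOKAHEAD = r"^(?!foo$).*$"
POSIX_STYLE = r"^(.{0,2}|.{4,}|[^f]..|.[^o].|..[^o])$"
candidates = ["".join(chars) for n in range(5) for chars in product("fox", repeat=n)]
disagreements = [s for s in candidates
                 if is_match(LOOKAHEAD, s) != is_match(POSIX_STYLE, s)]
print("disagreements:", disagreements)
```

An empty disagreements list suggests the two forms are interchangeable for single-line input, which is the case the POSIX workaround targets.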
You could use a negative lookahead from the start, e.g., ^(?!foo).*$ shouldn't match anything starting with foo.
You can put a ^ in the beginning of a character set to match anything but those characters.
[^=]*
will match everything but =
Just match /^index\.php/, and then reject whatever matches it.
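A minimal Python sketch of this "match, then reject" idea, which avoids lookaheads entirely. The helper name keep and the sample URLs are assumptions made for illustration.

```python
import re

# Positive pattern: the string starts with index.php
INDEX_PHP = re.compile(r"^index\.php")


def keep(url: str) -> bool:
    """Keep a string only when it does NOT start with index.php."""
    return INDEX_PHP.search(url) is None


urls = ["index.php?id=2342343", "index.html", "about.php", "index.php"]
kept = [u for u in urls if keep(u)]
print(kept)
```

Inverting the result in code keeps the regex itself trivial, at the cost of not being usable where only a single pattern can be supplied.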
In Python:
>>> import re
>>> p=r'^(?!index\.php\?[0-9]+).*$'
>>> s1='index.php?12345'
>>> re.match(p,s1)
>>> s2='index.html?12345'
>>> re.match(p,s2)
<_sre.SRE_Match object at 0xb7d65fa8>
Came across this thread after a long search. I had this problem when searching for and replacing multiple occurrences: the pattern I used matched all the way to the last occurrence. Example below:
import re
text = "start![image]xxx(xx.png) yyy xx![image]xxx(xxx.png) end"
replaced_text = re.sub(r'!\[image\](.*)\(.*\.png\)', '*', text)
print(replaced_text)
gave
start* end
Basically, the regex was matching from the first ![image] to the last .png, swallowing the yyy in the middle.
I used the method posted above (https://stackoverflow.com/a/17761124/429476 by Firish) to break the match between the occurrences. Here the space is excluded from the match, since the words are separated by spaces.
replaced_text = re.sub(r'!\[image\]([^ ]*)\([^ ]*\.png\)', '*', text)
and got what I wanted
start* yyy xx* end
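A different technique that may also work here is a lazy quantifier (.*? instead of .*), which stops at the first place where the rest of the pattern can match. This is a swapped-in alternative to the negated character class used above, shown as a sketch on the same sample text:

```python
import re

text = "start![image]xxx(xx.png) yyy xx![image]xxx(xxx.png) end"

# Greedy: .* runs as far as possible, so one match spans both images.
greedy = re.sub(r"!\[image\].*\(.*\.png\)", "*", text)

# Lazy: .*? expands only as far as needed, so each image matches separately.
lazy = re.sub(r"!\[image\].*?\(.*?\.png\)", "*", text)

print(greedy)
print(lazy)
```

Unlike the [^ ]* version, the lazy form does not rely on the occurrences being separated by spaces.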

Regex for letters, numbers, dashes only?

I am trying to validate a second level domain (everything before the .com and after the https://) in Ruby so that I can pass it into my namecheap api requests. Here is what I have so far, but I am not familiar with regex
validates_format_of :sld, with: [a-zA-Z0-9-]
no spaces allowed
no special characters allowed
however, dashes are allowed
cannot start with a dash
cannot end with a dash
I know that uppercase characters do not work in domain names, but I don't want to make users enter their text again. I will downcase the user input and show a flash message on the next page.
How about
validates_format_of :sld, with: /\A[a-z\d][a-z\d-]*[a-z\d]\z/i
Explanation:
\A - match beginning of string
[a-z\d] - match any letter from a-z or number from 0-9 once
[a-z\d-] - match any letter from a-z, number from 0-9, or dash zero or more times
[a-z\d] - match any letter from a-z or number from 0-9 once
\z - match end of string
i flag - make matches case-insensitive
Note: this will only work for strings of length 2 or more. If you need to support single-character inputs,
I would just write a method that checks the string length: if it's a single character, ensure it's not a dash; if it's 2 or more characters, validate it with this regex.
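Instead of a separate length check, single-character input could also be handled inside the pattern by making everything after the first character optional. The following Python sketch illustrates that idea. It is an alternative to the method described above, not the answer's own code, and the name valid_sld is hypothetical.

```python
import re

# After the first letter/digit, the string may either end, or continue with
# letters, digits or dashes and finish on a letter/digit.
# re.ASCII keeps case-insensitive matching limited to plain ASCII letters.
SLD = re.compile(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?", re.IGNORECASE | re.ASCII)


def valid_sld(value: str) -> bool:
    """True when the whole value is a valid second-level domain label."""
    return SLD.fullmatch(value) is not None


for sample in ["a", "ab", "my-site", "My-Site9", "-abc", "abc-", "a b", "-", ""]:
    print(repr(sample), valid_sld(sample))
```

In Ruby the same shape would presumably be written with \A and \z around the optional group.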
This will probably work:
^[0-9A-Za-z](|[-0-9A-Za-z]{0,61}[0-9A-Za-z])$
Your string needs to start with an alphanumeric character ([0-9A-Za-z])
Then, there are two choices ((|[-0-9A-Za-z]{0,61}[0-9A-Za-z])):
End of string
Between 0 and 61 alphanumeric or dash chars followed by an alphanumeric char. (For a maximum of 63 characters)
^ and $ are anchors
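The length limit is easy to check empirically. This Python sketch (the helper name valid_label is an assumption) feeds the pattern strings of 1, 63 and 64 characters, plus a few dash placements:

```python
import re

# Same pattern as above; the empty alternative (|...) allows a single character.
LABEL = re.compile(r"^[0-9A-Za-z](|[-0-9A-Za-z]{0,61}[0-9A-Za-z])$")


def valid_label(value: str) -> bool:
    """True when the whole value matches the label pattern."""
    return LABEL.fullmatch(value) is not None


print(valid_label("a"))        # single character
print(valid_label("a" * 63))   # 1 + 61 + 1 characters, the longest accepted
print(valid_label("a" * 64))   # one character too many
print(valid_label("a-b"), valid_label("-ab"), valid_label("ab-"))
```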
validates :sld, format: { with: /\A(?!-)[a-z\d-]{1,63}(?<!-)\z/i }
You can try out your regex at http://rubular.com/
\A(?!-) - negative lookahead at the start of the string: cannot start with dash
[a-z\d-] - match letters, digits \d, or dash - (\w is avoided because it also matches underscore)
{1,63} - match must be between 1 and 63 characters ({,63} would also allow an empty string)
(?<!-)\z - negative lookbehind at the end of the string: cannot end with dash
/i - case insensitive

What does this pattern ^[%w-.]+$ mean in Lua?

Just came across this pattern, which I really don't understand:
^[%w-.]+$
And could you give me some examples to match this expression?
Valid in Lua, where %w is (almost) the equivalent of \w in other languages
^[%w-.]+$ means match a string that is entirely composed of alphanumeric characters (letters and digits), dashes or dots.
Explanation
The ^ anchor asserts that we are at the beginning of the string
The character class [%w-.] matches one character that is a letter or digit (the meaning of %w), or a dash, or a period. This would be the equivalent of [\w-.] in JavaScript
The + quantifier matches such a character one or more times
The $ anchor asserts that we are at the end of the string
Reference
Lua Patterns
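Since the question asks for examples, here is a rough Python equivalent with a few matching and non-matching strings. This is only an approximation for illustration: Lua patterns are not regular expressions, and %w is spelled out here as ASCII letters and digits, whereas in Lua it depends on the current locale.

```python
import re

# Approximate equivalent of the Lua pattern ^[%w-.]+$
PATTERN = re.compile(r"[A-Za-z0-9.-]+")


def matches(value: str) -> bool:
    """True when the whole value consists of letters, digits, dashes or dots."""
    return PATTERN.fullmatch(value) is not None


for sample in ["example.com", "my-file.tar.gz", "v1.2-beta", "abc123",
               "hello world", "under_score", "a@b.c", ""]:
    print(repr(sample), matches(sample))
```

The first four samples should match. The strings with a space, an underscore or an @ sign should not, and neither should the empty string.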
Actually it will match nothing, because there is an error: w- is the start of a character range and it is out of order. So it should be %w\- instead.
^[%w\-.]+$
Means:
^ assert position at start of the string
[%w\-.]+ match a single character present in the list below
+ Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
%w a single character in the list %w literally (case sensitive)
\- matches the character - literally
. the literal character .
$ assert position at end of the string
Edit
As the OP changed the question and the tags, this answer no longer fits as a proper answer. It is a POSIX-based answer.
As @zx81 commented:
%w in Lua is close to \w: it matches any alphanumeric character, but unlike \w it does not include "_"

How to update this REGEX to make sure string does not have _ (underscore) at the end or beginning

This is the regular expression I have. I need to make sure that the string does not start or end with an underscore; an underscore may appear in between.
/^[a-zA-Z0-9_.-]+$/
I have tried
(?!_)
But it doesn't seem to work
Allowed strings:
abcd
abcd_123
Not allowed strings:
abcd_
_abcd_123
Not too hard!
/^[^_].*[^_]$/
"Any character except an underscore at the start of the line (^[^_]), then any characters (.*), then any character except an underscore before the end of the line ([^_]$)."
This does require at least two characters to validate the string. If you want to allow one character lines:
/^[^_](.*[^_]|)$/
"Anything except an underscore to start the line, and then either some characters plus a non-underscore character before end-of-line, or just an immediate end-of-line."
You could approach this in the inverse way,
Check all those that do match starting and ending underscores like this:
/^_|_$/
^_ #starts with underscore
| #OR
_$ #ends with underscore
And then eliminate those that match. The above regexp is much easier to read.
Check : http://www.rubular.com/r/H3Axvol13b
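A minimal Python sketch of this inverse approach: first check the allowed characters with the asker's original class, then reject anything that the underscore-edge pattern finds. The function name is_valid is hypothetical.

```python
import re

ALLOWED = re.compile(r"[a-zA-Z0-9_.-]+")   # the original character class
EDGE_UNDERSCORE = re.compile(r"^_|_$")     # starts OR ends with an underscore


def is_valid(value: str) -> bool:
    """Allowed characters only, and no underscore at either end."""
    return (ALLOWED.fullmatch(value) is not None
            and EDGE_UNDERSCORE.search(value) is None)


for sample in ["abcd", "abcd_123", "a", "abcd_", "_abcd_123", "_"]:
    print(repr(sample), is_valid(sample))
```

Splitting the rule into two simple checks keeps each pattern readable, which is the point this answer makes.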
Or you can try the longer regex:
/^[a-zA-Z0-9.-][a-zA-Z0-9_.-]*[a-zA-Z0-9.-]$|^[a-zA-Z0-9.-]+$|^[a-zA-Z0-9.-][a-zA-Z0-9.-]$/
^[a-zA-Z0-9.-] #starts with a-z, or A-Z, or 0-9, or . -
[a-zA-Z0-9_.-]* #anything that can occur and the underscore
[a-zA-Z0-9.-]$ #ends with a-z, or A-Z, or 0-9, or . -
| #OR
^[a-zA-Z0-9.-]+$ #for one-letter words (also matches any string without underscores)
| #OR
^[a-zA-Z0-9.-][a-zA-Z0-9.-]$ #for two letter words
Check: http://www.rubular.com/r/FdtCqW6haG
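The three alternatives can arguably be collapsed into a single pattern with an optional tail. The Python sketch below compares the longer regex with such a shorter variant (SHORT is an assumption introduced here, not part of the answer) on every string of up to four characters drawn from a small test alphabet.

```python
import re
from itertools import product

# The longer regex from the answer, unchanged.
LONG = re.compile(
    r"^[a-zA-Z0-9.-][a-zA-Z0-9_.-]*[a-zA-Z0-9.-]$"
    r"|^[a-zA-Z0-9.-]+$"
    r"|^[a-zA-Z0-9.-][a-zA-Z0-9.-]$"
)

# Hypothetical shorter variant: one edge character, then an optional tail
# that may contain underscores but must end on a non-underscore character.
SHORT = re.compile(r"^[a-zA-Z0-9.-](?:[a-zA-Z0-9_.-]*[a-zA-Z0-9.-])?$")

# Every string of length 0..4 over a letter, an underscore, a dot and one
# disallowed character.
candidates = ["".join(chars) for n in range(5) for chars in product("a_.!", repeat=n)]
differences = [s for s in candidates
               if (LONG.match(s) is None) != (SHORT.match(s) is None)]
print("differences:", differences)
```

If the list of differences is empty, the two patterns accept the same strings on this sample, which suggests the shorter form is a safe simplification.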
/^[a-zA-Z0-9.-][a-zA-Z0-9_.-]+[a-zA-Z0-9.-]$/
Try this
Description:
In the first section, [a-zA-Z0-9.-], the regex only allows lowercase and uppercase letters, digits, dot and hyphen.
In the next section, [a-zA-Z0-9_.-]+, the regex looks for one or more characters that are lowercase or uppercase letters, digits, dot, hyphen or underscore.
The last part, [a-zA-Z0-9.-], is the same as the first part and prevents the input from ending with an underscore. Because the middle section uses +, the string must be at least three characters long.
Try this:
Recently had the same concern and this is how I did it.
// '"^[a-zA-Z0-9_.-]*$"' → Alphanumeric and 「.」「_」「-」
// "^[^_].*[^_]$" → Reject start and end of string if contains 「_」
// (?=) REGEX AND operator
SLUG_REGEX = '"(?=^[a-zA-Z0-9_.-]*$)(?=^[^_].*[^_]$)"';
I used this snippet for my Laravel Validation so you may need to change the code as needed like " to / based on your code sample and other answers' code.
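The same "AND via lookaheads" idea, sketched in Python rather than Laravel (the name is_slug is hypothetical). Both lookaheads are anchored with ^, so each one inspects the whole string from position 0, and the overall match succeeds only if both succeed.

```python
import re

# First lookahead: only allowed characters. Second: no underscore at either end.
SLUG = re.compile(r"(?=^[a-zA-Z0-9_.-]*$)(?=^[^_].*[^_]$)")


def is_slug(value: str) -> bool:
    """True when both lookahead conditions hold for the value."""
    return SLUG.search(value) is not None


for sample in ["abcd", "abcd_123", "abcd_", "_abcd_123", "ab cd", "a"]:
    print(repr(sample), is_slug(sample))
```

Note that the second lookahead needs at least two characters, so a single-character value such as "a" is rejected by this pattern.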

username regex in rails

I am trying to find a regex to limit what a person can use for a username on my site. I don't need to have it check to see how many characters there are in it, as another validation does this. Basically all I need to make it do is make sure that it allows: letters (capital and lowercase) numbers, dashes and underscores.
I came across this: /^[-a-z]+$/i
But it doesn't seem to allow numbers.
What am I missing?
The regex you're looking for is
/\A[a-z0-9\-_]+\z/i
Meaning one or more characters from the range a-z, the range 0-9, - (escaped with a backslash so that it is not read as a range) and _, case insensitive (the i modifier)
Use
/\A[\w-]+\z/
\w is shorthand for letters, digits and underscore.
\A matches at the start of the string, \z matches at the end of the string. These tokens are called anchors, and Ruby is a bit special with regard to them: Most regex engines use ^ and $ as start/end-of-string anchors by default, whereas in Ruby they can also match at the start/end of lines (which matters if you're working with multiline strings). Therefore, it's safer (as @JustMichael pointed out) to use \A and \z because there is no such ambiguity.
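The difference between line anchors and string anchors can be illustrated in Python, where re.MULTILINE makes ^ and $ behave like Ruby's defaults. This is an analogy under that assumption, not Ruby code, and the payload string is made up for the example.

```python
import re

payload = "good_name\n<script>alert(1)</script>"

# Line anchors (like Ruby's ^ and $): one clean line is enough to match,
# even though the rest of the string contains disallowed characters.
line_anchored = re.search(r"^[\w-]+$", payload, re.MULTILINE)

# String anchors: the entire string must consist of allowed characters.
# Python spells Ruby's \z as \Z.
string_anchored = re.search(r"\A[\w-]+\Z", payload)

print(line_anchored is not None)
print(string_anchored is not None)
```

The line-anchored check accepts the payload because its first line looks like a valid username. The string-anchored check rejects it.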
Your regular expression contains a character class [-a-z] that allows the characters - (dash) and a through z. In order to expand the range of characters allowed by this character class, you will need to add more characters within the [].
Please see Character Classes or Character Sets for further information and examples.

Resources