A hero regular expression challenge

A hero regular expression challenge - grep

I've spent the whole afternoon trying to figure this out, and I think you guys could probably answer it in 5 minutes.
I need to list the occurrences of a string in a file, like this:
Look for zero or more of (a-z, A-Z, 0-9, _, -), followed by CHRIS or DAVE, followed by zero or more of (a-z, A-Z, 0-9, _, -).
So for a file like this:
<header>fooCHRISbar</header>
<madeup>DAVE123 more stuff after space
more stuff hereCHRISDAVE</done>
blah CHRIS.internet.com</done>
would return
fooCHRISbar
DAVE123
hereCHRISDAVE
CHRIS.internet.com
Basically, it's looking for all occurrences of CHRIS and DAVE, including the surrounding text (letters, digits, underscores or dashes, all the way up to a <, a space or something similar).

You need to add . to the second character class so that it also matches the string CHRIS.internet.com. Note that -P (Perl-compatible regular expressions) requires GNU grep:
$ grep -oP '[\w-]*(?:CHRIS|DAVE)[\w.-]*' file
fooCHRISbar
DAVE123
hereCHRISDAVE
CHRIS.internet.com
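
For readers who want to experiment with the pattern outside grep, here is a minimal Python sketch of the same regular expression. It assumes Python's re module behaves like grep -oP for this pattern; SAMPLE and PATTERN are illustrative names, and the sample text is simply the input from the question.

```python
import re

# Same pattern as in the grep -oP command above.
PATTERN = re.compile(r"[\w-]*(?:CHRIS|DAVE)[\w.-]*")

# Sample input taken from the question (illustrative only).
SAMPLE = """<header>fooCHRISbar</header>
<madeup>DAVE123 more stuff after space
more stuff hereCHRISDAVE</done>
blah CHRIS.internet.com</done>"""

# findall returns every non-overlapping match, like grep -o.
matches = PATTERN.findall(SAMPLE)
for m in matches:
    print(m)
```

The first class stays [\w-] on purpose: only the text after CHRIS or DAVE is allowed to contain dots.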

Related

how to grep a word with only one single capital letter?

The txt file is :
bar
quux
kabe
Ass
sBo
CcdD
FGH
I would like to grep the words with only one capital letter in this example, but when I use "grep [A-Z]", it shows me all the words that contain capital letters.
Could anyone find a "grep" solution for this? My expected output is
Ass
sBo

grep '\<[a-z]*[A-Z][a-z]*\>' my.txt
will match lines in the ASCII text file my.txt if they contain at least one word consisting entirely of ASCII letters, exactly one of which is upper case.

You seem to have a text file with each word on its own line.
You may use
grep '^[[:lower:]]*[[:upper:]][[:lower:]]*$' file
The ^ matches the start of the string (here, the line, since grep operates on a line-by-line basis by default), then [[:lower:]]* matches 0 or more lowercase letters, then [[:upper:]] matches any single uppercase letter, then [[:lower:]]* matches 0 or more lowercase letters, and $ asserts the position at the end of the string.
If you need to match a whole line with exactly one uppercase letter you may use
grep '^[^[:upper:]]*[[:upper:]][^[:upper:]]*$' file
The only difference from the pattern above is the [^[:upper:]] bracket expression, which matches any character except an uppercase letter.
To extract words with a single capital letter inside them you may use word boundaries, as shown in mathguy's answer. With GNU grep, you may also use
grep -o '\b[^[:upper:]]*[[:upper:]][^[:upper:]]*\b' file
grep -o '\b[[:lower:]]*[[:upper:]][[:lower:]]*\b' file
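
The whole-line pattern can also be tried outside grep. The following is a minimal Python sketch, assuming ASCII words, one per line, as in the question; [a-z] and [A-Z] stand in for [[:lower:]] and [[:upper:]], and the variable names are illustrative only.

```python
import re

# Word list from the question, one word per line.
words = ["bar", "quux", "kabe", "Ass", "sBo", "CcdD", "FGH"]

# Lowercase letters, exactly one uppercase letter, lowercase letters.
ONE_CAPITAL = re.compile(r"[a-z]*[A-Z][a-z]*")

# fullmatch plays the role of the ^ and $ anchors in the grep command.
single_capital_words = [w for w in words if ONE_CAPITAL.fullmatch(w)]
print(single_capital_words)
```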

Lua | String Pattern exclusion

For a game, I want the user to be able to enter commands.
For simplicity, all parameters are put into a table.
Example: "message all Hello" -> {"message","all","Hello"}
For that I've used the alphanumeric pattern (%w).
The problem is that characters like _ ; : . simply cannot be used, since they're not alphanumeric.
Is there any way to use the all-characters pattern (.) but ignore spaces?
Or is there any better way to do it?
Thank you for your help

According to Lua docs:
Making the letter after the % uppercase inverts the class, so %D will match all non-digit characters.
In the same way, %s matches whitespace and %S matches any non-whitespace character, so the pattern you're looking for is %S+.
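
To illustrate the idea of splitting on everything that is not whitespace, here is a minimal Python sketch. Python's \S plays the role of Lua's %S here, and split_command is a hypothetical helper name, not something from the question. In Lua itself the same loop would typically be written with string.gmatch and the "%S+" pattern.

```python
import re

def split_command(line):
    """Return the whitespace-separated parameters of a command line."""
    # \S+ matches each maximal run of non-whitespace characters.
    return re.findall(r"\S+", line)

params = split_command("message all Hello")
print(params)
```

Because the pattern only excludes whitespace, parameters such as player_1 or hi;there: are kept intact.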

What does this pattern ^[%w-.]+$ mean in Lua?

Just came across this pattern, which I really don't understand:
^[%w-.]+$
And could you give me some examples to match this expression?

This is valid in Lua, where %w is (almost) the equivalent of \w in other languages; unlike \w, it does not match the underscore.
^[%w-.]+$ means match a string that is entirely composed of alphanumeric characters (letters and digits), dashes or dots.
Explanation
The ^ anchor asserts that we are at the beginning of the string
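The character class [%w-.] matches one character that is a letter or digit (the meaning of %w), or a dash, or a period. This would be roughly the equivalent of [\w-.] in JavaScript. The Lua manual leaves the interaction between ranges and classes undefined, so [%w.-] or [%w%-.] is the safer spelling.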
The + quantifier matches such a character one or more times
The $ anchor asserts that we are at the end of the string
Reference
Lua Patterns
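
Since the question also asks for example strings, here is a minimal Python sketch that approximates the pattern. It assumes ASCII input, uses [A-Za-z0-9] as a stand-in for Lua's %w, and the names PATTERN, matches and results are illustrative only.

```python
import re

# ASCII-only approximation of the Lua pattern ^[%w-.]+$
PATTERN = re.compile(r"[A-Za-z0-9.-]+")

def matches(s):
    """True if s consists only of letters, digits, dashes or dots."""
    return PATTERN.fullmatch(s) is not None

examples = ["example.com", "my-file.txt", "abc123", "foo_bar", "two words", ""]
results = {s: matches(s) for s in examples}
for s, ok in results.items():
    print(repr(s), ok)
```

The first three examples match; foo_bar fails because of the underscore, "two words" because of the space, and the empty string because + requires at least one character.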

Actually, it will match nothing, because there is an error: w- is the start of a character range, and that range is out of order. It should be %w\- instead:
^[%w\-.]+$
Means:
^ assert position at start of the string
[%w\-.]+ match a single character present in the list below
+ Quantifier: Between one and unlimited times, as many times as possible, giving back as needed [greedy]
%w a single character in the list %w literally (case sensitive)
\- matches the character - literally
. the literal character .
$ assert position at end of the string
Edit
As the OP changed the question and the tags, this answer no longer fits as a proper answer; it is a POSIX-based answer.
As @zx81 commented, %w is Lua's counterpart of \w and matches any alphanumeric character (unlike \w, it does not match "_").

How to update this regex to make sure the string does not have _ (underscore) at the end or beginning

This is the regular expression I have. I need to make sure that the string does not start or end with an underscore; an underscore may appear in between.
/^[a-zA-Z0-9_.-]+$/
I have tried
(?!_)
but it doesn't seem to work.
Allowed strings:
abcd
abcd_123
Not allowed strings:
abcd_
_abcd_123

Not too hard!
/^[^_].*[^_]$/
"Any character except an underscore at the start of the line (^[^_]), then any characters (.*), then any character except an underscore before the end of the line ([^_]$)."
This does require at least two characters to validate the string. If you want to allow one-character lines:
/^[^_](.*[^_]|)$/
"Anything except an underscore to start the line, and then either some characters plus a non-underscore character before end-of-line, or just an immediate end-of-line."

You could approach this in the inverse way,
Check all those that do match starting and ending underscores like this:
/^_|_$/
^_ #starts with underscore
| #OR
_$ #ends with underscore
Then eliminate the strings that match. The regexp above is much easier to read.
Check : http://www.rubular.com/r/H3Axvol13b
Or you can try the longer regex:
/^[a-zA-Z0-9.-][a-zA-Z0-9_.-]*[a-zA-Z0-9.-]$|^[a-zA-Z0-9.-]$|^[a-zA-Z0-9.-][a-zA-Z0-9.-]$/
^[a-zA-Z0-9.-] #starts with a-z, or A-Z, or 0-9, or . -
[a-zA-Z0-9_.-]* #anything that can occur and the underscore
[a-zA-Z0-9.-]$ #ends with a-z, or A-Z, or 0-9, or . -
| #OR
^[a-zA-Z0-9.-]$ #for one-letter words
| #OR
^[a-zA-Z0-9.-][a-zA-Z0-9.-]$ #for two letter words
Check: http://www.rubular.com/r/FdtCqW6haG
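
The inverse approach can be sketched as a two-step check. This is a minimal Python sketch, not the answerer's code: it assumes the original character-class regex from the question is applied first, and ALLOWED, EDGE_UNDERSCORE and is_valid are hypothetical names.

```python
import re

# Step 1: the original validation from the question.
ALLOWED = re.compile(r"[a-zA-Z0-9_.-]+")
# Step 2: the "inverse" regex, which finds a leading or trailing underscore.
EDGE_UNDERSCORE = re.compile(r"^_|_$")

def is_valid(name):
    """Accept names made of allowed characters with no underscore at either end."""
    if not ALLOWED.fullmatch(name):
        return False
    return EDGE_UNDERSCORE.search(name) is None

for candidate in ["abcd", "abcd_123", "abcd_", "_abcd_123"]:
    print(candidate, is_valid(candidate))
```

Keeping the two checks separate makes each regex short enough to read at a glance, at the cost of one extra match per string.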

Try this:
/^[a-zA-Z0-9.-][a-zA-Z0-9_.-]+[a-zA-Z0-9.-]$/
Description:
In the first section, [a-zA-Z0-9.-], the regex only allows lowercase and uppercase letters, digits, dots and hyphens.
In the next section, [a-zA-Z0-9_.-]+, the regex looks for one or more characters that are lowercase or uppercase letters, digits, dots, hyphens or underscores.
The last part, [a-zA-Z0-9.-], is the same as the first part and prevents the input from ending with an underscore.
Note that, because of the +, this pattern requires at least three characters.

Try this:
I recently had the same concern, and this is how I solved it.
// '"^[a-zA-Z0-9_.-]*$"' → Alphanumeric and 「.」「_」「-」
// "^[^_].*[^_]$" → Reject strings that start or end with 「_」
// (?=) REGEX AND operator
SLUG_REGEX = '"(?=^[a-zA-Z0-9_.-]*$)(?=^[^_].*[^_]$)"';
I used this snippet for Laravel validation, so you may need to adapt it to your code and to the other answers' examples, for instance by changing the " delimiters to /.
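
The "AND" combination of two lookaheads can be tried with the following minimal Python sketch. It assumes Python's re module as a stand-in for the PHP regex engine used by Laravel; SLUG_REGEX mirrors the snippet above without the outer quotes, and is_slug is a hypothetical helper name.

```python
import re

# Two lookaheads anchored at the start: both conditions must hold.
SLUG_REGEX = re.compile(r"(?=^[a-zA-Z0-9_.-]*$)(?=^[^_].*[^_]$)")

def is_slug(value):
    """True if value uses only allowed characters and has no edge underscore."""
    return SLUG_REGEX.search(value) is not None

for candidate in ["abcd", "abcd_123", "abcd_", "_abcd_123", "a"]:
    print(candidate, is_slug(candidate))
```

As with the /^[^_].*[^_]$/ answer, the second lookahead needs two characters, so a one-character string such as "a" is rejected.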

regex validation - grails constraints

I'm pretty new to Grails, and I'm having a problem with the matches validation using a regex. I want my field to accept a combination of alphanumeric characters and specific special characters: period (.), comma (,) and dash (-). It may accept numbers only (099) or letters only (alpha), but it must not accept input that consists only of special characters (".-,"). Is it possible to filter this kind of input using a regex?
Please help. Thank you for sharing your knowledge.

^[0-9a-zA-Z,.-]*?[0-9a-zA-Z]+?[0-9a-zA-Z,.-]*$
Meaning:
^ beginning of the string
[0-9a-zA-Z,.-]*? 0 or more characters from this class (lazy matching)
[0-9a-zA-Z]+? 1 or more alphanumeric characters (lazy matching)
[0-9a-zA-Z,.-]* 0 or more characters from this class
$ end of the string
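
The behaviour of this pattern can be checked with a minimal Python sketch. It assumes Python's re module as a stand-in for the Java regex engine behind the Grails matches constraint; FIELD and is_valid are hypothetical names, and the sample inputs come from the question.

```python
import re

# Same pattern as above: at least one alphanumeric character is mandatory.
FIELD = re.compile(r"^[0-9a-zA-Z,.-]*?[0-9a-zA-Z]+?[0-9a-zA-Z,.-]*$")

def is_valid(value):
    """True if value contains only allowed characters and at least one alphanumeric."""
    return FIELD.match(value) is not None

for candidate in ["099", "alpha", "a-1,b.2", ".-,", ""]:
    print(repr(candidate), is_valid(candidate))
```

The middle class [0-9a-zA-Z]+? is what rejects ".-,": the surrounding classes may match nothing, but that one must consume at least one letter or digit.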

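I think you could match that with a regular expression like this:
".*[0-9a-zA-Z.,-]+.*"
That means:
".*" Start with zero or more characters of any kind
"[0-9a-zA-Z.,-]" Have characters in the ranges 0-9, a-z, A-Z, or . or , or -
"+" Have one or more of this kind of character (so it's mandatory to have one in this set)
".*" End with zero or more characters of any kind
This is working ok for me, hope it helps!
Note that because the bracket expression itself contains . , and -, and .* accepts any character, this pattern also matches input made up only of special characters; removing . , and - from the bracket expression makes at least one alphanumeric character mandatory.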
