I'm struggling with a regex. I'll start with what I want to achieve and then show what I have so far.
For example, I have these commit message lines:
merge(#2137): done something
Merge pull request #420 from Example/branch
feat(): done something [#2137JDN]
merge(#690): feat(): done something [#2137JDN]
I want to extract only the PR ID or, if there is none, the second hash:
#2137
#420
#2137JDN
#690
For now I have this regex, but it's not perfect:
/(\(|\s|\[)(#\d+|#.+)(\)|\s|\])/g
because it captures this:
(#2137)
\s#420\s
[#2137JDN]
(#690)[#2137JDN]
How can I improve it to get exactly what I want?
You can use the #[\dA-Z]+ pattern to grep only hashes.
command | grep -Po "#[\dA-Z]+"
This returns all matched strings (in our case, the hashes):
#2137
#420
#2137JDN
#690
#2137JDN
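Unfortunately, this also prints the second hash from the last line, because -o reports every match on a line. Non-greedy quantifiers are not available in grep's default basic and extended regex syntaxes. See this answer.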
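If only the first hash on each line is wanted (the PR ID when present, otherwise the trailing hash), one option is to use sed instead of grep -o, since a substitution can keep just the first match. This is a minimal sketch, not part of the original answer: the function name first_hash is made up for illustration, and it assumes hashes consist of digits and uppercase letters, as in the sample lines.

```shell
# Print only the first "#..." token of each input line.
# [^#]* skips everything before the first '#'; the group keeps the hash itself.
first_hash() {
  sed -n 's/^[^#]*\(#[0-9A-Z][0-9A-Z]*\).*/\1/p'
}

first_hashes=$(printf '%s\n' \
  'merge(#2137): done something' \
  'Merge pull request #420 from Example/branch' \
  'feat(): done something [#2137JDN]' \
  'merge(#690): feat(): done something [#2137JDN]' | first_hash)

printf '%s\n' "$first_hashes"
```

On the four sample lines this prints one hash per line, with #690 rather than #2137JDN for the last one.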
I have a problem to work on and was wondering why my regex won't work. It's a simple exercise: match the words in a dictionary file that consist only of letters from the top row. I believe I have a solution, but grep comes up blank every time:
grep ^[qwertyuiop]+$ /opt/~~~~~~/data/web2
This is my command, which prints nothing. But if I just use:
grep [qwertyuiop] /opt/~~~~~~/data/web2
it matches words with letters from the top row. Can anybody tell me why it isn't working? Thank you all for your time.
You're super close.
With grep you can use the -x flag to match the whole line, together with an escaped + to match one or more characters:
grep -x '[qwertyuiop]\+' /usr/share/dict/american-english
If you want to avoid -x, you can keep your original approach like so:
grep '^[qwertyuiop]\+$' /usr/share/dict/american-english
With the escape and some quotes it works marvelously, although I think -x is more idiomatic. As some other people have commented, you can also use -E, which enables extended regular expressions so that + needs no backslash, although it also changes the meaning of other characters such as ?, |, ( and ). I recommend man grep, which gives a nice overview.
By default grep uses basic regular expressions, where ^ and $ work as anchors but an unescaped + is just a literal plus sign. You have to use grep -E or egrep to have + treated as a quantifier on its own.
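The difference between the two syntaxes can be checked on a tiny word list. This is a small sketch with made-up input words standing in for the dictionary file; the variable names are illustrative only.

```shell
# Stand-in for the dictionary: three top-row words and one that is not.
words=$(printf '%s\n' typewriter pepper quiet apple)

# Basic syntax (the default): an unescaped + is a literal plus sign,
# so none of these words match. "|| true" ignores grep's no-match status.
bre_plain=$(printf '%s\n' "$words" | grep '^[qwertyuiop]+$' || true)

# Extended syntax: + means "one or more", no backslash needed.
ere=$(printf '%s\n' "$words" | grep -E '^[qwertyuiop]+$')

# -x anchors the pattern to the whole line, replacing ^ and $.
ere_x=$(printf '%s\n' "$words" | grep -xE '[qwertyuiop]+')

printf 'basic, unescaped +: [%s]\n' "$bre_plain"
printf 'extended:\n%s\n' "$ere"
```

The quotes around the pattern matter too: they stop the shell from interpreting characters such as [ ] and $ before grep sees them.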
I'm trying to extract some activity names from a log file with grep.
The log file looks like the line below:
[07/16/2019 14:09:26:516 CEST] 00018e0a main I [APPName][DEBUG][Various Activity Name][SERVICE] node: <resultSet recordCount="3" columnCount="6">
My goal is to go through the log file and get all activity names (they vary and are sometimes unknown).
I tried with a regex but I can't reach the final goal :-P.
How can I extract something like this (here comes my pseudocode):
grep "[AppName][DEBUG] [*]" logfile.log
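You're probably better off using sed for this. Just make sure that you escape characters properly. Something like this should work, and you can pipe it to sort and uniq to get a unique list of activity names. The -n option together with the trailing p prints only the lines where the substitution matched, so unrelated log lines are not passed through unchanged:
sed -n 's/.*\[APPName\]\[DEBUG\]\[\([^]]*\)\].*/\1/p' logfile.log | sort | uniq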
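To see what the extraction yields, here is a small self-contained run. It is only a sketch: the first sample line is the one from the question, the other two are invented for illustration, and it uses sed -n with the p flag so that lines without the markers are dropped.

```shell
# Sample input: the first line is from the question; the other two are made up.
log_sample='[07/16/2019 14:09:26:516 CEST] 00018e0a main I [APPName][DEBUG][Various Activity Name][SERVICE] node: <resultSet recordCount="3" columnCount="6">
[07/16/2019 14:09:27:001 CEST] 00018e0a main I [APPName][DEBUG][Another Activity][SERVICE] node: <x/>
[07/16/2019 14:09:27:002 CEST] 00018e0a main I unrelated line without the markers'

# \([^]]*\) captures everything up to the next ']' after [APPName][DEBUG][.
# -n plus the trailing p prints only lines where the substitution succeeded.
activities=$(printf '%s\n' "$log_sample" |
  sed -n 's/.*\[APPName\]\[DEBUG\]\[\([^]]*\)\].*/\1/p' | sort -u)

printf '%s\n' "$activities"
```

Here sort -u is used as a shorthand for sort followed by uniq.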
This is a common problem I encounter when using grep. Say the pattern is 'chr1' in a third column of a file, when I do the following:
grep 'chr1' file
How can I avoid getting the results including chr10, chr11, chr13 etc as well?
Thanks!
It seems this works:
grep -w 'chr1' file
Since you're interested in values in specific columns, you're much better off using awk:
awk '$3 == "chr1"' file
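The two answers behave differently when the word also appears in another column. A minimal sketch with a hypothetical three-column sample (the data and variable names are made up for illustration):

```shell
# Hypothetical sample; note "chr1" in column 1 of the last row.
sample='a 1 chr1
b 2 chr10
chr1 3 chr11'

# -w matches chr1 as a whole word anywhere on the line.
word_hits=$(printf '%s\n' "$sample" | grep -w 'chr1')

# awk compares only the third whitespace-separated field.
column_hits=$(printf '%s\n' "$sample" | awk '$3 == "chr1"')

printf 'grep -w:\n%s\n' "$word_hits"
printf 'awk:\n%s\n' "$column_hits"
```

grep -w keeps the last row because of its first column, while the awk comparison keeps only the row whose third column is exactly chr1.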
This answer suggests that grep -P supports the (?:pattern) syntax, but it doesn't seem to work for me (the group is still captured and displayed as part of the match). Am I missing something?
I am trying grep -oP "(?:syntaxHighlighterConfig\.)[a-zA-Z]+Color" SyntaxHighlighter.js on this code, and expect the results to be:
wikilinkColor
externalLinkColor
parameterColor
...
but instead I get:
syntaxHighlighterConfig.wikilinkColor
syntaxHighlighterConfig.externalLinkColor
syntaxHighlighterConfig.parameterColor
...
"Non-capturing" doesn't mean that the group isn't part of the match; it means that the group's value isn't saved for use in back-references. What you are looking for is a look-behind zero-width assertion:
grep -Po "(?<=syntaxHighlighterConfig\.)[a-zA-Z]+Color" file
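A quick way to check this is to run the pattern on a few lines. The sketch below assumes GNU grep built with PCRE support (the -P option); the sample lines are invented in the style of the question's file. It also shows \K, a PCRE feature that discards everything matched so far from the reported match, as an alternative to the look-behind.

```shell
# Made-up lines in the style of the question's file.
js_sample='var a = syntaxHighlighterConfig.wikilinkColor;
var b = syntaxHighlighterConfig.externalLinkColor;
var c = otherConfig.parameterColor;'

# Look-behind: the prefix must be present but is not part of the match.
lookbehind=$(printf '%s\n' "$js_sample" |
  grep -oP '(?<=syntaxHighlighterConfig\.)[a-zA-Z]+Color')

# \K: match the prefix, then drop it from the reported match.
keep_reset=$(printf '%s\n' "$js_sample" |
  grep -oP 'syntaxHighlighterConfig\.\K[a-zA-Z]+Color')

printf '%s\n' "$lookbehind"
```

Both variants print only the property names, and the third line is skipped because its prefix differs.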
What is the way to use 'grep' to search for combinations of a pattern in a text file?
Say, for instance I am looking for "by the way" and possible other combinations like "way by the" and "the way by"
Thanks.
Awk is the tool for this, not grep. On one line:
awk '/by/ && /the/ && /way/' file
Across the whole file:
gawk -v RS='\0' '/by/ && /the/ && /way/' file
Note that this is searching for the 3 words, not searching for combinations of those 3 words with spaces between them. Is that what you want?
Provide more details including sample input and expected output if you want more help.
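Because /by/, /the/ and /way/ are substring tests, a line such as "bypass the subway" also matches. If whole words are wanted, one possible variant is to compare each field exactly. This is a sketch on made-up input, assuming words are separated by whitespace with no punctuation attached:

```shell
phrases='by the way
the way by
bypass the subway
by the may'

# Substring test, as in the one-liner above: "bypass the subway" also matches.
substring_rows=$(printf '%s\n' "$phrases" | awk '/by/ && /the/ && /way/')

# Whole-word test: compare each whitespace-separated field exactly.
word_rows=$(printf '%s\n' "$phrases" | awk '{
  by = the = way = 0
  for (i = 1; i <= NF; i++) {
    if ($i == "by")  by = 1
    if ($i == "the") the = 1
    if ($i == "way") way = 1
  }
}
by && the && way')

printf 'substring:\n%s\n' "$substring_rows"
printf 'whole words:\n%s\n' "$word_rows"
```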
The simplest approach is probably a regexp, although this one is also slightly wrong:
egrep '([ ]*(by|the|way)\>){3}'
This matches any of your three words, including any spaces in front of it, and forces it to end at a word boundary (hence the \> at the end). The line matches if words from the group occur three times in a row.
Example of running it:
$ echo -e "the the the\nby the\nby the way\nby the may\nthe way by\nby the thermo\nbypass the thermo" | egrep '([ ]*(by|the|way)\>){3}'
the the the
by the way
the way by
As already said, this produces a false positive for "the the the", but if you can live with that, I'd recommend doing it this way.
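If the false positive is a problem, a different technique is to chain three whole-word filters, so a line survives only if it contains each word at least once, in any order. This is a sketch of that alternative rather than the approach above; it relies on the -w option, and unlike the regexp it does not require the words to be adjacent.

```shell
# Same test lines as in the example run above.
input='the the the
by the
by the way
by the may
the way by
by the thermo
bypass the thermo'

# Each grep -w keeps only lines containing that word as a whole word.
all_three=$(printf '%s\n' "$input" | grep -w 'by' | grep -w 'the' | grep -w 'way')

printf '%s\n' "$all_three"
```

On this input only "by the way" and "the way by" remain.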