How to use non-capturing groups in grep?

This answer suggests that grep -P supports the (?:pattern) syntax, but it doesn't seem to work for me (the group is still captured and displayed as part of the match). Am I missing something?
I am trying grep -oP "(?:syntaxHighlighterConfig\.)[a-zA-Z]+Color" SyntaxHighlighter.js on this code, and expect the results to be:
wikilinkColor
externalLinkColor
parameterColor
...
but instead I get:
syntaxHighlighterConfig.wikilinkColor
syntaxHighlighterConfig.externalLinkColor
syntaxHighlighterConfig.parameterColor
...

"Non-capturing" doesn't mean that the group isn't part of the match; it means that the group's value isn't saved for use in back-references. What you are looking for is a look-behind zero-width assertion:
grep -Po "(?<=syntaxHighlighterConfig\.)[a-zA-Z]+Color" file
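To see the difference side by side, here is a minimal sketch. It assumes GNU grep built with PCRE support (the -P flag), and the two sample lines are made up to stand in for SyntaxHighlighter.js. The \K form is an alternative to the look-behind that discards everything matched before it.

```shell
# Hypothetical sample input; the real SyntaxHighlighter.js is not shown in the question.
sample='syntaxHighlighterConfig.wikilinkColor = "#0000ff";
syntaxHighlighterConfig.externalLinkColor = "#00aa00";'

# Non-capturing group: the prefix is consumed, so -o still prints it.
with_group=$(printf '%s\n' "$sample" | grep -oP '(?:syntaxHighlighterConfig\.)[a-zA-Z]+Color')

# Look-behind: the prefix must be present but is not part of the match.
with_lookbehind=$(printf '%s\n' "$sample" | grep -oP '(?<=syntaxHighlighterConfig\.)[a-zA-Z]+Color')

# \K: match the prefix, then drop it from the reported match.
with_k=$(printf '%s\n' "$sample" | grep -oP 'syntaxHighlighterConfig\.\K[a-zA-Z]+Color')

printf '%s\n' "$with_lookbehind"
```

The look-behind and \K variants should both print only the property names, while the non-capturing group prints the full syntaxHighlighterConfig.* strings.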

Related

Grep Individual Commands not working when combined in Multi Pattern grep command

I have a need to perform multiple grep matches as part of the same grep command. When I run them individually, they work fine. But not when together. I hope someone could either show me a solution or perhaps can help me find a work-around. Here is sample stream:
(string start..) RollUp:"V" Enzyme:"ENZA ENZB ENZD ENZE" (..string end)
In the first command I need to isolate all RollUp substrings. The value is always A or V:
grep -o "RollUp:\"[AV]\""
In the second command I need to isolate all combinations of Enzyme values (1-20 in total, separated by spaces; I don't know the value names). This command works:
grep -oE 'Enzyme:[[:space:]]*"[^"]+"'
However, I need to match both patterns as part of same stream. When I try:
grep -oE "RollUp:\"[AV]\""\|Enzyme:[[:space:]]*"[^"]+""
, nothing is returned. I would be grateful for any ideas for getting this double grep pattern match to work. Thank you!

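The regex something[^"]+ matches the string something followed by every character up to, but not including, the next ". The + means one or more occurrences. Put the whole expression in single quotes so that the shell passes the inner double quotes to grep unchanged:
grep -oE 'RollUp:"[^"]+"|Enzyme:[[:space:]]*"[^"]+"' file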
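A quick way to check a combined pattern is to run it against the sample stream from the question. This is only a sketch: it uses the stricter [AV] class from the question for the RollUp branch, and the variable names are made up for illustration.

```shell
# Sample stream copied from the question.
line='(string start..) RollUp:"V" Enzyme:"ENZA ENZB ENZD ENZE" (..string end)'

# One single-quoted ERE with both alternatives; -o prints each match on its own line.
matches=$(printf '%s\n' "$line" | grep -oE 'RollUp:"[AV]"|Enzyme:[[:space:]]*"[^"]+"')

printf '%s\n' "$matches"
```

Single quotes matter here: inside a double-quoted shell string every unescaped " ends the string, so grep never sees the pattern as written.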

Match pattern ending with a certain character in grep

This is a common problem I encounter when using grep. Say the pattern is 'chr1' in the third column of a file, and I do the following:
grep 'chr1' file
How can I avoid getting the results including chr10, chr11, chr13 etc as well?
Thanks!

It seems this works:
grep -w 'chr1' file

Since you're interested in values in specific columns, you're much better off using awk:
awk '$3 == "chr1"' file
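The two answers are not equivalent, which a small made-up table shows. In this sketch the data and variable names are hypothetical; the last row has chr1 in the first column only.

```shell
# Hypothetical whitespace-separated data with the chromosome in column 3.
data='geneA 100 chr1
geneB 200 chr10
geneC 300 chr11
chr1 400 chr2'

# -w matches chr1 as a whole word anywhere on the line.
word_hits=$(printf '%s\n' "$data" | grep -w 'chr1')

# awk compares only the third field.
column_hits=$(printf '%s\n' "$data" | awk '$3 == "chr1"')

printf 'grep -w:\n%s\n\nawk:\n%s\n' "$word_hits" "$column_hits"
```

grep -w excludes chr10 and chr11 but still returns the last row, because chr1 appears there as a word in another column. The awk test returns only the geneA row.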

How to use grep to search for an exact word match in TextWrangler

There is a possibility to search using grep in TextWrangler
I want to find and replace the following word: bauvol, but not bauvolumen.
I tried typing ^bauvol$ into the search field but that didn't do the trick, it didn't find anything, although the word is clearly there.
I think it's because, in grep, ^ and $ signify the start and end of a line, not of a word?!

You want to use \b as word boundaries, as @gromi08 said:
\bbauvol\b
If you want to reuse the matched word (so you can replace it, modify it, change the case, etc.), it is usually best to wrap it in parentheses ( and ) so you can reference the group in the Replace box:
Find:
(\bbauvol\b)
Replace:
<some_tag>\1</some_tag>
Did you have anything specific you were trying to do with the result once you found it (cut it, duplicate it, etc.)?
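Outside the editor, the same find/replace pair can be tried on the command line. This is a rough sketch that assumes GNU sed, since \b is a GNU extension and may not work in other sed implementations. The sample sentence is made up.

```shell
# Hypothetical text containing both the target word and the longer word.
text='bauvol und bauvolumen'

# \b(bauvol)\b matches the standalone word only; \1 re-inserts it inside the tag.
tagged=$(printf '%s\n' "$text" | sed -E 's/\b(bauvol)\b/<some_tag>\1<\/some_tag>/g')

printf '%s\n' "$tagged"
```

Only the standalone bauvol gets wrapped; bauvolumen is left alone because there is no word boundary between bauvol and umen.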

Use the -w option of grep (see the grep man page).
This option matches the expression only as a whole word.
Therefore the command will be:
grep -w bauvol file.txt
And yes, ^ and $ are for start and end of line.
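The difference between line anchors and word matching shows up when counting matching lines. This sketch uses made-up sample text and hypothetical variable names.

```shell
# Hypothetical text: the word inside a sentence, as part of a longer word, and alone on a line.
text='das bauvol ist hier
bauvolumen
bauvol'

# ^bauvol$ only matches a line that consists of the word and nothing else.
anchored_count=$(printf '%s\n' "$text" | grep -c '^bauvol$')

# -w matches the word wherever it stands on its own, but not inside bauvolumen.
word_count=$(printf '%s\n' "$text" | grep -cw 'bauvol')

printf 'anchored: %s, whole word: %s\n' "$anchored_count" "$word_count"
```

The anchored pattern finds one line and -w finds two, which is why ^bauvol$ misses the word whenever other text shares its line.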

grep from beginning of found word to end of word

I am trying to grep the output of a command that outputs unknown text and a directory per line. Below is an example of what I mean:
.MHuj.5.. /var/log/messages
The text and directory may be different from time to time or system to system. All I want to do though is be able to grep the directory out and send it to a variable.
I have looked around but cannot figure out how to grep to the end of a word. I know I can start the search phrase looking for a "/", but I don't know how to tell grep to stop at the end of the word, or whether it will consider the next "/" a new word. The directories listed could change, so I can't assume the same number of directories will be listed each time. In some cases, there will be multiple lines listed and each will have a directory in its output. Thanks for any help you can provide!

If your directory paths do not contain spaces, then you can do:
$ echo '.MHuj.5.. /var/log/messages' | awk '{print $NF}'
/var/log/messages

It's not clear from a single example whether we can generalize that e.g. the first occurrence of a slash marks the beginning of the data you want to extract. If that holds, try
grep -o '/.*' file
To fetch everything after the last space, try
grep -o '[^ ]*$' file
For more advanced pattern matching and extraction, maybe look at sed, or Awk or Perl or Python.
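Since the goal is to end up with the path in a variable, either grep can be wrapped in command substitution. A small sketch using the sample line from the question (the variable names are made up):

```shell
# Sample line from the question.
line='.MHuj.5.. /var/log/messages'

# Everything from the first slash to the end of the line.
from_slash=$(printf '%s\n' "$line" | grep -o '/.*')

# The run of non-space characters at the end of the line.
last_word=$(printf '%s\n' "$line" | grep -o '[^ ]*$')

printf '%s\n%s\n' "$from_slash" "$last_word"
```

Both variables hold /var/log/messages for this input. They diverge if the leading text itself contains a slash or if the path contains a space.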

Your line can be described as:
^\S+\s+(\S+)$
That's assuming whitespace is your delimiter between the random text and the directory. It simply separates the whitespace from the non-whitespace and captures the second part.
Or you might want to look into the word boundary assertion: \b.
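One way to apply that capture from the shell is sed with a back-reference. The sketch below swaps \S and \s for the POSIX classes [^[:space:]] and [[:space:]], because the shorthand escapes are not available in every sed. The variable name is made up.

```shell
# Sample line from the question.
line='.MHuj.5.. /var/log/messages'

# Replace the whole line with the captured second field.
dir=$(printf '%s\n' "$line" | sed -E 's/^[^[:space:]]+[[:space:]]+([^[:space:]]+)$/\1/')

printf '%s\n' "$dir"
```

Lines that do not fit the two-field shape pass through unchanged, so it may be worth checking the result before using it.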

I know you said to use grep, but I can't help mentioning that this is trivially done using awk:
awk '{ print $NF }' input.txt
This assumes that whitespace is the delimiter and that the path does not contain any whitespace.

Opposite of "only-matching" in grep?

Is there any way to do the opposite of showing only the matching part of strings in grep (the -o flag), that is, show everything except the part that matches the regex?
That is, the -v flag is not the answer, since that would not show files containing the match at all, but I want to show these lines, but not the part of the line that matches.
EDIT: I wanted to use grep over sed, since it can do "only-matching" matches on multi-line, with:
cat file.xml|grep -Pzo "<starttag>.*?(\n.*?)+.*?</starttag>"

This is a rather unusual requirement, and I don't think grep can alter the strings like that. You can achieve this with sed, though (double quotes so that the shell expands $PATTERN, a variable holding your regex):
sed -n "s/$PATTERN//gp" file
EDIT in response to OP's edit:
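You can do multiline matching with sed, too, if the file is small enough to load it all into memory. Note that sed's regex syntax has no lazy quantifiers, so .*? is not non-greedy here; with several <starttag> blocks the match can extend from the first opening tag to the last closing tag: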
sed -rn ':r;$!{N;br};s/<starttag>.*?(\n.*?)+.*?<\/starttag>//gp' file.xml

You can do that with a little help from sed:
grep "pattern" input_file | sed 's/pattern//g'
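Here is a small sketch of the sed approach on made-up input. The pattern id=[0-9]*; and the sample lines are hypothetical; -n with the p flag prints only lines where a substitution happened, which mirrors grep's line selection.

```shell
# Hypothetical input: two lines contain the pattern, one does not.
input='id=42;name=alice
name=bob
id=7;name=carol'

# Delete the matched part and print only the lines that had a match.
rest=$(printf '%s\n' "$input" | sed -n 's/id=[0-9]*;//gp')

printf '%s\n' "$rest"
```

The name=bob line is dropped because nothing matched on it, and the other two lines come out without their id= prefix.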

I don't think there is a way in grep.
If you use ack, you could output Perl's special variables $` and $' to show everything before and after the match, respectively (the backtick must be escaped inside double quotes so the shell does not treat it as command substitution):
ack string --output="\$\`\$'"
Similarly, if you wanted to output what did match along with other text, you could use $&, which contains the matched string:
ack string --output='Matched: $&'
