Grep Individual Commands not working when combined in Multi Pattern grep command

I have a need to perform multiple grep matches as part of the same grep command. When I run them individually, they work fine. But not when together. I hope someone could either show me a solution or perhaps can help me find a work-around. Here is sample stream:
(string start..) RollUp:"V" Enzyme:"ENZA ENZB ENZD ENZE" (..string end)
In the first command I need to isolate all RollUp substrings. The value is always A or V:
grep -o "RollUp:\"[AV]\""
In the second command I am needing to isolate all combinations of Enzyme values (1-20 total, spaces in between, don't know values names). This command works:
grep -oE 'Enzyme:[[:space:]]*"[^"]+"'
However, I need to match both patterns as part of same stream. When I try:
grep -oE "RollUp:\"[AV]\""\|Enzyme:[[:space:]]*"[^"]+""
, nothing is returned. I would be grateful for any ideas for getting this double grep pattern match to work. Thank you!

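The regex something[^"]+ means the string "something" followed by any characters up to the next ". The + sign means one or more matches. Put the whole pattern in single quotes so the shell leaves the inner " characters alone, and separate the alternatives with a bare | under -E:
grep -oE 'RollUp:"[^"]+"|Enzyme:[[:space:]]*"[^"]+"' file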
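The combined attempt in the question most likely fails because of shell quoting: inside a double-quoted string, each unescaped inner " closes the string, so grep never receives the pattern as typed. A minimal sketch using the sample line from the question (the variable names line and matches are only for illustration):

```shell
line='(string start..) RollUp:"V" Enzyme:"ENZA ENZB ENZD ENZE" (..string end)'
# Single quotes keep every " literal; with -E a bare | separates the alternatives.
matches=$(printf '%s\n' "$line" | grep -oE 'RollUp:"[AV]"|Enzyme:[[:space:]]*"[^"]+"')
printf '%s\n' "$matches"
```

Each match is printed on its own line, in the order it appears in the input.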

Grep last match until the end

I've looked around StackExchange sites but I haven't found anything that's quite what I'm looking for. Here are two use cases of grep:
Printing items before/after a match
Print a certain match
I'm trying to parse a log file, and I want to return the last error in the log which is, predictably, at the end of the file. However, sometimes the errors are multiple lines. The answers for 'how to grep the last match' all involve either tail or head, and only work with a single line.
In my case, I want to simply return everything in the file, starting with the last match. Typically, this won't be any more than 10-15 lines maximum, so a grep -A 15 does the trick there. But, I still need to only get the last one of these, so that alone doesn't produce the right output.
The naive approach is to use a two-part match, to first get what the last match is and then everything after that. This won't work for me, because I can't guarantee that the last match is unique.
Is it possible to do this with grep somehow, or would there be better tools for this?
There is a way to get sed to do this but I can't remember.
If you are open to using a combination of commands here is something that might work:
# Get the line number of the last match
LNO=$( grep -n 'the error' the_file | tail -n 1 | cut -d":" -f1 )
# Now use sed to print all lines from that point:
sed -n "$LNO,\$p" the_file
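If the pattern never matches, LNO ends up empty and sed is handed an invalid address. A hedged sketch that wraps the two steps in a function and returns non-zero when nothing matches. The function name last_match_to_end and the throwaway sample file are assumptions for illustration, not part of the answer above:

```shell
# Hypothetical helper: print from the last line matching $1 to the end of file $2.
last_match_to_end() {
    pattern=$1
    file=$2
    lno=$(grep -n -- "$pattern" "$file" | tail -n 1 | cut -d: -f1)
    # No match: report failure instead of passing sed a bad address.
    [ -n "$lno" ] || return 1
    sed -n "${lno},\$p" "$file"
}

# Example with a throwaway file.
tmp=$(mktemp)
printf 'foo123\nerror 1\nxyz\nerror 2\n99999\n88888\n' > "$tmp"
result=$(last_match_to_end 'error' "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Because the line number is used rather than the matched text, it does not matter whether the last matching line is unique.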
I think there's an exact duplicate somewhere, but I found only these close ones:
How to get lines from the last match to the end of file?
grep last match and it's following lines
Here's one way to do it:
$ cat ip.txt
foo123
error 1
xyz
error 2
99999
88888
$ tac ip.txt | sed '/error/q' | tac
error 2
99999
88888
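On systems without tac, the same idea can probably be expressed in a single awk pass. This is a sketch under the assumption that the block after the last match is small enough to hold in memory; the variable names are illustrative:

```shell
# Start buffering at each match; a later match discards the earlier buffer.
last_block=$(printf 'foo123\nerror 1\nxyz\nerror 2\n99999\n88888\n' |
    awk '/error/ { buf = ""; found = 1 }
         found   { buf = buf $0 ORS }
         END     { printf "%s", buf }')
printf '%s\n' "$last_block"
```

If the input contains no match at all, nothing is printed.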

duplicate grep output when comparing two files

I have literally been at this for 5 hours. I have busybox on my device, and unfortunately its grep does not have -X to make my life easier.
Edit:
I have two lists, both containing MAC addresses. Essentially I just want offline MAC address lookup so I don't have to keep looking vendors up online.
list.txt has the vendor MAC prefixes. This isn't the complete list, just an example:
00:13:46
00:15:E9
00:17:9A
00:19:5B
00:1B:11
00:1C:F0
scan holds a list of full-length MAC addresses whose vendors are unknown. Whenever there is a match, I want the line in scan to be output.
It pretty much does that, but it first outputs everything from the scan file and then outputs the matching line at the end, which causes duplicates. I tried sort -u, but it has no effect. It is as if there are two different outputs from two different methods: the whole scan file is output instantly, and a couple of seconds later the matching line is output.
From searching I came across this:
#!/bin/bash
while read line; do
grep -F 'list' 'scan'
done < list.txt
This displays the duplicate result when a match is found: the output pretty much echoes my scan file and then displays the matched pattern, creating a duplicate.
It is frustrating that I have not found a solution after clicking on all the links in Google up to page 9.
Please, someone help me.
I don't know if the Busybox sed supports this out of the box, but it should be easy to do in Awk or Perl instead then.
Create a sed script to print lines from file2 which are covered by a prefix in file1 by transforming each line in file1 into a sed command to print a match for that regular expression, anchored to the start of the line:
sed 's%.*%/^&/p%' file1 | sed -n -f - file2
The same in Awk:
awk 'NR==FNR { a[++i]="^" $0; next }
{ for (j=1; j<=i; ++j) if ($0 ~ a[j]) print }' file1 file2
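When every prefix has the same length, a lookup table avoids testing each prefix against each line. The sketch below assumes that each scan line starts with the MAC address and that prefixes are exactly three octets (8 characters); the sample addresses and the temporary directory are made up for illustration:

```shell
tmpdir=$(mktemp -d)
printf '00:13:46\n00:15:E9\n00:17:9A\n' > "$tmpdir/list.txt"
printf '00:15:e9:11:22:33\nAA:BB:CC:00:13:46\n00:17:9A:44:55:66\n' > "$tmpdir/scan"

# First file: remember each prefix, upper-cased. Second file: print a line
# when its first 8 characters (three octets plus two colons) are a known prefix.
vendor_hits=$(awk 'NR == FNR { seen[toupper($0)]; next }
                   toupper(substr($0, 1, 8)) in seen' "$tmpdir/list.txt" "$tmpdir/scan")
printf '%s\n' "$vendor_hits"
rm -rf "$tmpdir"
```

Each scan line is printed at most once, and comparing upper-cased text makes the lookup case-insensitive.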
OK guys, I did a nested for loop (probably very inefficient), but I got it working, printing the matching MAC addresses with this:
#!/usr/bin/bash
# Compare the first three octets of each scanned MAC against every vendor prefix.
for scanlist in $(cut -d: -f1,2,3 scan)
do
    for listt in $(cat list)
    do
        if [[ $scanlist == "$listt" ]]; then
            # Print the full scan line(s) containing the matched prefix.
            grep "$scanlist" scan
        fi
    done
done
If anyone can make this more elegant, please do, but it works for me for now. I think the problem was that one list contained just 00:11:22 while the other contained 00:11:22:33:44:55, which is why I cut each scan entry down to the same length as the entries in the other list. This outputs only the matches instead of duplicating output.

grep from beginning of found word to end of word

I am trying to grep the output of a command that outputs unknown text and a directory per line. Below is an example of what I mean:
.MHuj.5.. /var/log/messages
The text and directory may be different from time to time or system to system. All I want to do though is be able to grep the directory out and send it to a variable.
I have looked around but cannot figure out how to grep to the end of a word. I know I can start the search phrase looking for a "/", but I don't know how to tell grep to stop at the end of the word, or whether it will consider the next "/" a new word or not. The directories listed could change, so I can't assume the same number of directories will be listed each time. In some cases, there will be multiple lines listed and each will have a directory in its output. Thanks for any help you can provide!
If your directory paths do not contain spaces, then you can do:
$ echo '.MHuj.5.. /var/log/messages' | awk '{print $NF}'
/var/log/messages
It's not clear from a single example whether we can generalize that e.g. the first occurrence of a slash marks the beginning of the data you want to extract. If that holds, try
grep -o '/.*' file
To fetch everything after the last space, try
grep -o '[^ ]*$' file
For more advanced pattern matching and extraction, maybe look at sed, or Awk or Perl or Python.
Your line can be described as:
^\S+\s+(\S+)$
That's assuming whitespace is your delimiter between the random text and the directory. It simply separates the whitespace from the non-whitespace and captures the second part.
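Plain grep cannot print just a capture group, so one way to apply this pattern and store the result in a variable is sed. A minimal sketch, assuming a sed that accepts -E (GNU and BSD sed do); POSIX bracket classes stand in for \S and \s, and the variable names are illustrative:

```shell
line='.MHuj.5.. /var/log/messages'
# [^[:space:]] plays the role of \S, [[:space:]] of \s; \1 is the captured path.
dir=$(printf '%s\n' "$line" |
    sed -nE 's/^[^[:space:]]+[[:space:]]+([^[:space:]]+)$/\1/p')
printf '%s\n' "$dir"
```

Lines that do not fit the two-field shape produce no output because of -n and the p flag.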
Or you might want to look into the word-boundary anchor: \b.
I know you said to use grep, but I can't help mentioning that this is trivially done using awk:
awk '{ print $NF }' input.txt
This assumes that whitespace is the delimiter and that the path does not contain any whitespace.

How to use non-capturing groups in grep?

This answer suggests that grep -P supports the (?:pattern) syntax, but it doesn't seem to work for me (the group is still captured and displayed as part of the match). Am I missing something?
I am trying grep -oP "(?:syntaxHighlighterConfig\.)[a-zA-Z]+Color" SyntaxHighlighter.js on this code, and expect the results to be:
wikilinkColor
externalLinkColor
parameterColor
...
but instead I get:
syntaxHighlighterConfig.wikilinkColor
syntaxHighlighterConfig.externalLinkColor
syntaxHighlighterConfig.parameterColor
...
"Non-capturing" doesn't mean that the group isn't part of the match; it means that the group's value isn't saved for use in back-references. What you are looking for is a look-behind zero-width assertion:
grep -Po "(?<=syntaxHighlighterConfig\.)[a-zA-Z]+Color" file
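PCRE also offers \K, which drops everything matched so far from the reported match and can be used in place of a look-behind. A sketch assuming a GNU grep built with -P support; the two sample input lines are invented for illustration:

```shell
# \K discards the prefix matched so far, so only the name after the dot is printed.
names=$(printf '%s\n' \
    'a = syntaxHighlighterConfig.wikilinkColor;' \
    'b = syntaxHighlighterConfig.parameterColor;' |
    grep -oP 'syntaxHighlighterConfig\.\K[a-zA-Z]+Color')
printf '%s\n' "$names"
```

Unlike a look-behind, the part before \K may have variable length.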

Opposite of "only-matching" in grep?

Is there any way to do the opposite of showing only the matching part of strings in grep (the -o flag), that is, show everything except the part that matches the regex?
The -v flag is not the answer, since it would not show lines containing the match at all. I want to show those lines, just without the part of the line that matches.
EDIT: I wanted to use grep over sed, since it can do "only-matching" matches on multi-line, with:
cat file.xml|grep -Pzo "<starttag>.*?(\n.*?)+.*?</starttag>"
This is a rather unusual requirement, and I don't think grep can alter the strings like that. You can achieve this with sed, though (double quotes let the shell expand $PATTERN):
sed -n "s/$PATTERN//gp" file
EDIT in response to OP's edit:
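You can do multiline matching with sed, too, if the file is small enough to load it all into memory:
sed -rn ':r;$!{N;br};s/<starttag>.*?(\n.*?)+.*?<\/starttag>//gp' file.xml
Note that sed's regular expressions have no non-greedy quantifiers, so if the file contains several <starttag> blocks, this removes everything from the first <starttag> to the last </starttag>.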
You can do that with a little help from sed:
grep "pattern" input_file | sed 's/pattern//g'
I don't think there is a way in grep.
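A small worked example of the sed approach, assuming a sed that accepts -E and a pattern that does not contain the / delimiter; the sample input and variable names are invented for illustration:

```shell
pattern='[0-9]+'
# -n plus the p flag prints only lines where a substitution happened,
# that is, matching lines with the matched text removed.
remainder=$(printf 'abc123def\nno digits here\nx9y\n' | sed -nE "s/$pattern//gp")
printf '%s\n' "$remainder"
```

The line without digits is skipped, which mirrors how grep -o skips non-matching lines.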
If you use ack, you could output Perl's special variables $` and $' to show everything before and after the match, respectively (the backtick must be escaped inside double quotes, or the shell starts a command substitution):
ack string --output="\$\`\$'"
Similarly, if you wanted to output what did match along with other text, you could use $&, which contains the matched string:
ack string --output='Matched: $&'
