How to exclude some of the matches from grep?

I am using grep to print the matching lines from a very large file. I got hundreds of matches, and some of them are not interesting; I want to exclude those.
grep "WARNING" path | grep -v "WARNING_HANDLING_THREAD"   # this is what I tried
When I grep the file for WARNING I get:
0-00:00:33.392 (2127:127:250:02 = 21.278532 Fri Feb 1 10:17:22 2019) <3:0x000a>:[89]:[enter]: cest_handleFreeReq.c:116: [WARNING]: cest_handleFreeReq: sent from DECA ->UCS
0-00:00:38.263 (2189:022:166:06 = 21.891510 Fri Feb 1 10:17:28 2019) <3:0x000a>:[89]:[enter]: cest_handleConfigReq.c:176: [WARNING]: cest_handleConfigReq.c: GroupConfig NOT present.
0-00:00:38.263 (2189:022:167:03 = 21.891510 Fri Feb 1 10:17:28 2019) <3:0x000a>:[89]:[enter]: cest_handleConfigReq.c:194: [WARNING]: cest_handleConfigReq: physicalConfig NOT present.
60 0x6d77 0 0x504ea | 2 18 | 0 0 | 4 12 | 647 | 14685 0 0.0 0 500 500 | 0 | 0 | 38 | ETH_DRV_WARNING_HANDLING_thread
60 0 | 0 0 | 0 0 0 | 0 0 0 0 0 0 ! N/A N/A N/A N/A N/A N/A |ETH_DRV_WARNING_HANDLING_thread
WARNING: List of threads violating the heap & stack limit
I want to exclude the last three lines, which are not interesting, and keep only:
0-00:00:33.392 (2127:127:250:02 = 21.278532 Fri Feb 1 10:17:22 2019) <3:0x000a>:[89]:[enter]: cest_handleFreeReq.c:116: [WARNING]: cest_handleFreeReq: sent from DECA ->UCS
0-00:00:38.263 (2189:022:166:06 = 21.891510 Fri Feb 1 10:17:28 2019) <3:0x000a>:[89]:[enter]: cest_handleConfigReq.c:176: [WARNING]: cest_handleConfigReq.c: GroupConfig NOT present.
0-00:00:38.263 (2189:022:167:03 = 21.891510 Fri Feb 1 10:17:28 2019) <3:0x000a>:[89]:[enter]: cest_handleConfigReq.c:194: [WARNING]: cest_handleConfigReq: physicalConfig NOT present.
Is there a way to do this using grep, find, or any other tool?
Thank you

Note that the substring thread is in lower case in the data, but in upper case in your expression.
Instead, use
grep -F 'WARNING' logfile | grep -F -v 'WARNING_HANDLING_thread'
The -F makes grep use string comparison rather than regular expression matching (this is not really related to your current issue; it's just a way of being explicit about what type of pattern we're matching with).
Another option would be to make the second grep do case insensitive matching with -i:
grep -F 'WARNING' logfile | grep -Fi -v 'WARNING_HANDLING_THREAD'
In this case though, I would probably match the [WARNING] tag instead:
grep -F '[WARNING]:' logfile
Note that here we need the -F so that grep interprets the pattern as a string and not as a regular expression matching any single character out of the W, A, R, N, I, G set, followed by a :.
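If you prefer doing it in a single process instead of a pipeline, a minimal sketch with awk (assuming the file is called logfile, as above):
awk '/\[WARNING\]:/ && !/WARNING_HANDLING_thread/' logfile
Here the brackets are escaped so that [WARNING]: is matched literally, and the second pattern drops the thread-status lines in the same pass.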

Related

grep invert match on two files

I have two text files containing one column each, for example -
File_A File_B
1 1
2 2
3 8
If I do grep -f File_A File_B > File_C, I get File_C containing 1 and 2. I want to know how to use grep -v on two files so that I can get the non-matching values, 3 and 8 in the above example.
Thanks.
You can also use comm, if your version allows an empty output delimiter:
$ # -3 means suppress lines common to both input files
$ # by default, tab character appears before lines from second file
$ comm -3 f1 f2
3
	8
$ # change it to empty string
$ comm -3 --output-delimiter='' f1 f2
3
8
Note: comm requires sorted input, so use comm -3 --output-delimiter='' <(sort f1) <(sort f2) if they are not already sorted
You can also pass the common lines obtained from grep as input to grep -v. Tested with GNU grep; some versions might not support all of these options:
$ grep -Fxf f1 f2 | grep -hxvFf- f1 f2
3
8
-F option to match strings literally, not as regex
-x option to match whole lines only
-h to suppress file name prefix
-f - (written f- above) to read the patterns from stdin instead of from a file
awk 'NR==FNR{a[FNR]=$0; next} a[FNR]!=$0 {print a[FNR], $0}' f1 f2
3 8
To understand the meaning of NR and FNR, check the output below, which prints them for every line of both files:
awk '{print NR,FNR}' f1 f2
1 1
2 2
3 3
4 1
5 2
6 3
The condition NR==FNR is used to restrict processing to the first file, since NR and FNR are equal only while the first file is being read.
With GNU diff command (to compare files line by line):
diff --suppress-common-lines -y f1 f2 | column -t
The output (the left column contains lines from f1, the right column lines from f2):
3 | 8
-y, --side-by-side - output in two columns
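To answer the literal question of using grep -v across two files, a minimal sketch run once in each direction (assuming f1 and f2 each hold one value per line, as above):
$ grep -vxFf f2 f1
3
$ grep -vxFf f1 f2
8
Here -v inverts the match, -x matches whole lines only, -F treats the patterns as literal strings, and -f names the file to read the patterns from.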

Similar lines in log, only want to grep 1 of them

A log generates errors, and it generates two very similar lines for one error.
I want to grep these errors, but keep only one of the two lines.
For example:
Line is : Mar 21 15:33:04 VMP05 SMC_User: FATAL ECSDPROD 5210/SUPPORT/ECSD 21/03/17 15:33:04 VMD25 DIR_CHECK 0 FATAL File /data1/gmq6/in/29920991077061 is more than 10 minutes old
Line is : Mar 21 15:33:04 VMP05 SMC_User: FATAL ECSDPROD 5210/SUPPORT/ECSD 21/03/17 15:33:04 VMD18 DIR_CHECK 0 FATAL File /data1/sftp/out/26515991064454 is more than 10 minutes old
Mar 21 15:33:04 VMP05 SMC_User: FATAL ECSDPROD 5210/SUPPORT/ECSD 21/03/17 15:33:04 VMD25 DIR_CHECK 0 FATAL File /data1/gmq6/in/29920991077061 is more than 10 minutes old
Mar 21 15:33:04 VMP05 SMC_User: FATAL ECSDPROD 5210/SUPPORT/ECSD 21/03/17 15:33:04 VMD18 DIR_CHECK 0 FATAL File /data1/sftp/out/26515991064454 is more than 10 minutes old
But I only want to grep the lines without 'Line is'. I am using Hewlett Packard Linux.
EDIT:
This grep is needed within a tail -f:
#!/usr/bin/ksh
echo "checking for last 10 fatals"
grep "FATAL ECSDPROD" /data1/log/startstop/MonitorDaemon.log|tail > /tmp/AH/linesDP.txt
grep "FATAL ECSD" /tmp/AH/linesDP.txt | grep -v "Line is"
echo "\n\n----------\n"
echo "checking for new fatals"
tail -f /data1/log/startstop/MonitorDaemon.log | grep "FATAL ECSD" | grep -v "Line is"
echo "about to exit"
exit 0
With the above, the tail isn't updating; the script gets all the way down to the echo "checking for new fatals" and then it won't tail the log.
Maybe this helps.
tail -f /data1/log/startstop/MonitorDaemon.log | grep --line-buffered "FATAL ECSD" | grep -v "Line is"
Elaboration: when its output goes to a pipe rather than a terminal, grep block-buffers it, so matches may not appear until the buffer fills; --line-buffered makes the first grep flush each matching line immediately, which is why the pipeline seemed to hang. You can use the pipe symbol ( | ) multiple times in one command.
I think I might have solved this by using a combination of awk and grep.
I am testing it at the moment but so far it works.
#!/usr/bin/ksh
echo "Checking For The Last 5 Fatals"
grep "FATAL ECSDPROD" /data1/log/startstop/MonitorDaemon.log|tail > /tmp/AH/linesDP.txt
grep "FATAL ECSD" /tmp/AH/linesDP.txt | grep -v "Line is"
echo "\n\n----------\n"
echo "Checking For New Fatals"
tail -f /data1/log/startstop/MonitorDaemon.log | awk '/FATAL ECS/' | grep -v "Line is"
echo "about to exit"
exit 0
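For the live tail, the match and the exclusion could also be folded into one awk that flushes each line explicitly; a sketch assuming an awk that supports fflush() (GNU awk does):
tail -f /data1/log/startstop/MonitorDaemon.log | awk '/FATAL ECSD/ && !/Line is/ { print; fflush() }'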

How do I grab a specific section of a stdout?

I am trying to grab the sda# of a drive that was just inserted.
tail -f /var/log/messages | grep sda:
Returns: Mar 12 17:21:55 raspberrypi kernel: [ 1133.736632] sda: sda1
I would like to grab the sda1 part of the stdout. How would I do that?
I suggest using this with GNU grep:
| grep -Po 'sd[a-z]+: \Ksd[a-z0-9]+$'
\K: This sequence resets the starting point of the reported match. Any previously matched characters are not included in the final matched sequence.
See: The Stack Overflow Regular Expressions FAQ
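If a grep with -P is not available, a rough alternative sketch (assuming the partition name is always the last field of the matching line, as in the sample above):
tail -f /var/log/messages | grep --line-buffered 'sda:' | awk '{print $NF}'
The --line-buffered flag keeps grep from block-buffering its output into the pipe while following the log.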

grep specific pattern from a log file

I am passing all my svn commit log messages to a file and want to grep only the JIRA issue numbers from that.
Some lines might have more than 1 issue number, but I want to grab only the first occurrence.
The pattern is XXXX-999 (the number of alphabetic and numeric characters is not constant).
Also, I don't want the entire line to be displayed, just the JIRA number, without duplicates. I used the following command, but it didn't work.
Could someone help please?
cat /tmp/jira.txt | grep '^[A-Z]+[-]+[0-9]'
Log file sample
------------------------------------------------------------------------
r62086 | userx | 2015-05-12 11:12:52 -0600 (Tue, 12 May 2015) | 1 line
Changed paths:
M /projects/trunk/gradle.properties
ABC-1000 This is a sample commit message
------------------------------------------------------------------------
r62084 | usery | 2015-05-12 11:12:12 -0600 (Tue, 12 May 2015) | 1 line
Changed paths:
M /projects/training/package.jar
EFG-1001 Test commit
Output expected:
ABC-1000
EFG-1001
First of all, it seems like you have the second + in the wrong place; it should be at the end of the [0-9] expression.
Second, I think all you need to do is use the -o option to grep (to display only the matching portion of the line) and then pipe the grep output through sort -u, like this:
cat /tmp/jira.txt | grep -oE '^[A-Z]+-[0-9]+' | sort -u
Although if it were me, I'd skip the cat step and just give the filename to grep, like so:
grep -oE '^[A-Z]+-[0-9]+' /tmp/jira.txt | sort -u
Six of one, half a dozen of the other, really.
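Note that the ^ anchor works here because the issue number starts the commit-message line. If the key could appear later in a line and you still want only the first occurrence per line, a sketch using awk's match(), which reports just the first match, would be:
awk 'match($0, /[A-Z]+-[0-9]+/) { print substr($0, RSTART, RLENGTH) }' /tmp/jira.txt | sort -u
Without the anchor, any uppercase-dash-digits token anywhere in the file will match, so check the output against your actual log format.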

Need grep help on substring pattern from one file and match against another to see if pattern exists

I have an input flat file like this with many rows:
Apr 3 13:30:02 aag8-ca-acs01-en2 CisACS_01_PassedAuth p1n5ut5s 1 0 Message-Type=Authen OK,User-Name=joe7#it.test.com,NAS- IP-Address=4.196.63.55,Caller-ID=az-4d-31-89-92-90,EAP Type=17,EAP Type Name=LEAP,Response Time=0,
Apr 3 13:30:02 aag8-ca-acs01-en2 CisACS_01_PassedAuth p1n6ut5s 1 0 Message-Type=Authen OK,User-Name=bobe#jg.test.com,NAS-IP-Address=4.197.43.55,Caller-ID=az-4d-4q-x8-92-80,EAP Type=17,EAP Type Name=LEAP,Response Time=0,
Apr 3 13:30:02 abg8-ca-acs01-en2 CisACS_01_PassedAuth p1n4ut5s 1 0 Message-Type=Authen OK,User-Name=jerry777#it.test.com,NAS-IP-Address=7.196.63.55,Caller-ID=az-4d-n6-4e-y2-90,EAP Type=17,EAP Type Name=LEAP,Response Time=0,
Apr 3 13:30:02 aca8-ca-acs01-en2 CisACS_01_PassedAuth p1n4ut5s 1 0 Message-Type=Authen OK,User-Name=frctom#pe.test.com,NAS-IP-Address=4.196.263.55,Caller-ID=az-4d-x1-d3-c2-90,EAP Type=17,EAP Type Name=LEAP,Response Time=0,
Apr 3 13:30:02 aag8-ca-acs01-en2 CisACS_01_PassedAuth p1n4ut5s 1 0 Message-Type=Authen OK,User-Name=frc77#xed.test.com,NAS-IP-Address=4.136.163.55,Caller-ID=az-4d-4w-b5-s2-90,EAP Type=17,EAP Type Name=LEAP,Response Time=0,
Apr 3 13:30:02 aag8-ca-acs01-en2 CisACS_01_PassedAuth p1n4ut5s 1 0 Message-Type=Authen OK,User-Name=petejg#it.test.com,NAS-IP-Address=4.136.62.55,Caller-ID=az-4e-31-x3-92-c0,EAP Type=17,EAP Type Name=LEAP,Response Time=0
I'm trying to grep the email addresses from input file to see if they already exist in the master file.
Master flat file looks like this:
a44e31999290;frc777o.#it.test.com;20150403
az4d4qx89280;bobe#jg.test.com;20150403
0dbgd0fed04t;rrfuf#us.test.com;20150403
28cbe9191d53;rttuu4en#us.test.com;20150403
az4d4wb5s290;frc77#xed.test.com;20150403
d89695174805;ccis6n#cn.test.com;20150403
s00dbg0fe04t;rrfuuuf#be.test.com;20150403
If the email doesn't exist in master I want a simple count. So using the examples I hope to see count=4 (bobe#jg.test.com & frc77#xed.test.com exist in master, but the others don't).
I have tried various combinations of grep; the one below is what I was testing last, but it still does not work. I'm using this within a perl script to first capture the emails and then count them, but all I really need is the count of emails from the input file that don't exist in master.
grep -o -P '(?<=User-Name=\).*(?=,NAS-IP-)' $infile $mstr > $new_emails;
Any help would be appreciated, Thanks.
It's not exactly a one-liner, but this works for me:
for email in $(sed "s/.*User-Name=\(.[^,]*\),.*/\1/g" input.txt); do
    grep -oc $email master.txt
done | sort | uniq -c | awk '{if ($2==0) print $1}'
Explanation:
The sed command gets me a clean list of email addresses from the input file:
$ sed "s/.*User-Name=\(.[^,]*\),.*/\1/g" input.txt
joe7#it.test.com
bobe#jg.test.com
jerry777#it.test.com
frctom#pe.test.com
frc77#xed.test.com
petejg#it.test.com
The grep command looks for each of these addresses in the master file and (because of the -c flag) returns 0 for no match and 1 for match:
$ for email in $(sed "s/.*User-Name=\(.[^,]*\),.*/\1/g" input.txt); do
$ grep -oc $email master.txt
$ done
0
1
0
0
1
0
The sort and uniq commands get the frequency of matches and non-matches:
$ ... | sort | uniq -c
4 0
2 1
And finally the awk command prints out the number of non-matches (it will print the first column only if the second column is 0):
$ awk '{if ($2==0) print $1}'
4
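Since only the count is needed, the whole thing could also be done in one awk pass; a sketch assuming the master file's second semicolon-separated field is the email address (as in the sample) and using the same file names input.txt and master.txt:
awk -F';' 'NR==FNR { seen[$2]; next }
           match($0, /User-Name=[^,]+/) {
               email = substr($0, RSTART + 10, RLENGTH - 10)
               if (!(email in seen)) count++
           }
           END { print "count=" count }' master.txt input.txt
On the sample data this prints count=4, matching the result above.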
