remove word using "grep -v" exact match only - grep

How can I do an exact match using grep -v?
For example: the following command
for i in 0 0.3 a; do echo $i | grep -v "0"; done
returns a. But I want it to return 0.3 a.
Using
grep -v "0\b"
is not working

for i in 0 0.3 a; do echo $i | grep -v "^0$"; done
You need to match the start and end of the string with ^ and $
So, we say "match the beginning of a line, the char 0 and then the end of the line.
$ for i in 0 0.3 a; do echo $i | grep -v "^0$"; done
0.3
a

The safest way for single-column entries is using awk. Normally, I would use grep with the -w flag, but since you want to exactly match an integer that could be part of a float, it is a bit more tricky. The <dot>-character makes it hard to use any of
grep -vw 0
grep -v '\b0\b'
grep -v '\<0\>'
The proposed solution also will only work on perfect lines, what if you have a lost space in front or after your zero. The line will fail. So the safest would be:
single column file:
awk '($1!="0")' file
multi-word file: (adopt the variable FS to fit your needs)
awk '{for(i=1;i<=NF;++i) if($i == "0") next}1' file

Related

grep count all matches in one line of single pattern

How can I count all matches with grep, when there are more than one match per line?
$ cat example.txt
int foo(int a, char b) { int i = 80; return i; }
$ grep -c "int" example.txt
1
I want to output 3 (since int appears 3 times in the file)
With grep that supports -o option:
$ grep -o 'int' ip.txt | wc -l
3
With ripgrep:
$ rg -oc 'int' ip.txt
3
The best is to use awk for this:
$ awk '{c+=gsub(/int/,"")}END{print c}' file
The function gsub is used to perform substitutions and returns the total amount of substitutions done. In the above, we replace the regular expression int with an empty string to do the counting. This is done for every line of file. Per line, we add the count to the variable c which is by default initialized to zero. At the END of the script, we print the value of c.
Consider this awk approach using search word a field separator:
awk -F 'int' '{print NF-1}' file
3

Linux RegEx Grep Repeat character from n to m times

I have a problem with this Linux command:
ls | grep -E 'i{2,3}'
.It should take a file that has at least 2 i and max 3 i, but it doesn't work.
This is the output
ls:
life.py, viiva.txt, viiiiiiiiiva.txt
grep:
viiva.txt, viiiiiiiiiva.txt (with the first 3 I highlighted)
Thanks for the help.
Issue with OP's attempt grep -E 'i{2,3}' will match two or three consecutive occurrences of i anywhere in the input, so 4 or more consecutive i is also a valid match.
Parsing ls output is not recommended, see Why not parse ls (and what to do instead)?. If you wish to pass the filenames after filtering to some other command, find is a good option.
$ ls
1i2i3i.txt aibi.txt II.txt life.py viiiiiiiiiva.txt viiva.txt
$ # files with 2 or 3 consecutive i
$ # note that the regex will act on entire filename, thus anchors are not needed
$ find -type f -regextype egrep -regex '[^i]*i{2,3}[^i]*'
./viiva.txt
$ # files with 2 or 3 i anywhere in the name
$ find -type f -regextype egrep -regex '[^i]*i[^i]*i[^i]*(i[^i]*)?'
./aibi.txt
./1i2i3i.txt
./viiva.txt
$ # files with 2 or 3 i anywhere in the name, ignoring case
$ find -type f -regextype egrep -iregex '[^i]*i[^i]*i[^i]*(i[^i]*)?'
./II.txt
./aibi.txt
./1i2i3i.txt
./viiva.txt
If filenames won't cause an issue, you can grep -xE or grep -ixE with above regex, where x option will make sure the regex matches the whole line, instead of anywhere in the line. Or you can also use awk:
$ # NF will give number of fields after splitting on i
$ ls | awk -F'i' 'NF>=3 && NF<=4'
1i2i3i.txt
aibi.txt
viiva.txt
$ ls | awk -F'[iI]' 'NF>=3 && NF<=4'
1i2i3i.txt
aibi.txt
II.txt
viiva.txt

XOR operator with grep when we grep on 2 patterns

I am using grep to find match of 2 patterns with condition OR like this, classically:
grep -E 'C_matrix|F_matrix' triplot_XC_dev_for_right_order_with_FoM.py | wc -l
I would like now to exclude the cases when both pattern are matched on the same line, i.e I would like to use a XOR operator with grep.
How can I do this operation ? Maybe another trick is possible (I think about grep -v to exclude but this would be nice to do this operation in ony one command line with grep -E).
When you want to make such a special case, it is better to make use of awk:
$ awk '(/C_matrix/ && !/F_matrix/) || (!/C_matrix/ && !/F_matrix/)' file
Using GNU awk, you can use the bit-manipulation function xor:
$ awk 'xor(/C_matrix/,/F_matrix/)' file

How do I 'grep -c' and avoid printing files with zero '0' count

The command 'grep -c blah *' lists all the files, like below.
% grep -c jill *
file1:1
file2:0
file3:0
file4:0
file5:0
file6:1
%
What I want is:
% grep -c jill * | grep -v ':0'
file1:1
file6:1
%
Instead of piping and grep'ing the output like above, is there a flag to suppress listing files with 0 counts?
SJ
How to grep nonzero counts:
grep -rIcH 'string' . | grep -v ':0$'
-r Recurse subdirectories.
-I Ignore binary files (thanks #tongpu, warlock).
-c Show count of matches. Annoyingly, includes 0-count files.
-H Show file name, even if only one file (thanks #CraigEstey).
'string' your string goes here.
. Start from the current directory.
| grep -v ':0$' Remove 0-count files. (thanks #LaurentiuRoescu)
(I realize the OP was excluding the pipe trick, but this is what works for me.)
Just use awk. e.g. with GNU awk for ENDFILE:
awk '/jill/{c++} ENDFILE{if (c) print FILENAME":"c; c=0}' *

grep that match around the first match

I would like to grep a specific word 'foo' inside specific files, then get the N lines around my match and show only the blocks that contain a second grep.
I found this but it doesn't really work...
find . | grep -E '.*?\.(c|asm|mac|inc)$' | \
xargs grep --color -C3 -rie 'foo' | \
xargs -n1 --delimiter='--' | grep --color -l 'bar'
For instance I have the file 'a':
a
b
c
d
bar
f
foo
g
h
i
j
bar
l
The file b:
a
bar
c
d
e
foo
g
h
i
j
k
I expect this for grep -c2 on both files because bar is contained in the -c2 range of foo. I do not get any match for ./bar because bar is not in the range -c2 of foo...
--
./foo- bar
./foo- f
./foo- **foo**
./foo- g
./foo- h
--
Any ideas?
You could do this pretty simply with a "while read line" loop:
find -regextype posix-extended -regex "./file[a-z]" | while read line; do grep -nHC2 "foo" $line | grep --color bar; done
Output:
./filea-5-bar
./filec-46-... host pwns.me [94.23.120.252]: 451 4.7.1 Local bar
configuration error ...
In this example, I created the following files:
filea - your example a
fileb - your example b
filec - some random exim log output with foo and bar tossed in 2 lines apart
filed - the same exim log output, but with foo and bar tossed in 3 lines apart
You could also pipe the output after done, to alter the format:
; done | sed 's/-([0-9]{1,6})-/: line: \1 ::: /'
Formatted output
./filea: line: 5 ::: bar
./filec: line: 46 ::: ... host pwns.me [94.23.120.252]: 451 4.7.1 Local bar configuration error ...
I think I only understand the first line of your question and this does what I think you mean!
#!/bin/bash
N=2
pattern1=a
pattern2=z
matchinglines=$(awk -v p="$pattern1" '$0~p{print NR}' file) # Generate array of matching line numbers
for x in ${matchinglines[#]}
do
((start=x-N))
[[ $start -lt 1 ]] && start=1 # Avoid passing negative line nmumbers to sed
((end=x+N))
echo DEBUG: Checking block between lines $start and $end
sed -ne "${start},${end}p" file | grep -q "$pattern2"
[[ $? -eq 0 ]] && sed -ne "${start},${end}p" file
done
You need to set pattern1 and pattern2 at the start of the script. It basically does some awk to build an array of the line numbers that match your first pattern. Then it loops through the array and sets the start and end range to +/-N either side of each matching line number. It then uses sed to extraact that block and passes it through grep to see if it contains pattern2 printing it if it does. It may not be the most efficient, but it is easy enough to understand and maintain.
It assumes your file is called file
pipe it twice
grep "[^foo\n]" | grep "\n{ntimes}foo\n{ntimes}"

Resources