grep combine -f and -E

I want to combine the contents of a pattern file with a regular expression, i.e. grep -E -f.
The input file has the format
2 List_of_anthropologists<!!>Q1279970
3 List_of_Governors_of_Alabama<!!>Q558677
2027476 12th_Dalai_Lama<!!>Q25240
etc..
and the pattern file has the format:
13th_Dalai_Lama
5th_Dalai_Lama
etc...
I can make it work by manually putting in the pattern "13th_Dalai_Lama":
grep -E "^(\d*)(?:\t)13th_Dalai_Lama" input_file
But how do I combine this with the -f option so that 13th_Dalai_Lama is replaced by the lines in the pattern file?

With GNU grep, GNU sed and bash:
grep -f <(sed 's/.*/\\b&\\b/' pattern_file) input_file
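If the leading number and tab from the question should stay part of the match, sed can generate one extended regex per line of the pattern file. The following is only a sketch: it assumes tab-separated input, assumes the names contain no regex metacharacters, uses a hypothetical third sample line (Q00000 is made up), and feeds the generated patterns through -f - (patterns from standard input, as GNU grep accepts) instead of bash process substitution.

```shell
# Throwaway sample data mirroring the formats in the question.
workdir=$(mktemp -d)
tab=$(printf '\t')
printf '2\tList_of_anthropologists<!!>Q1279970\n' >  "$workdir/input_file"
printf '2027476\t12th_Dalai_Lama<!!>Q25240\n'     >> "$workdir/input_file"
printf '2027477\t13th_Dalai_Lama<!!>Q00000\n'     >> "$workdir/input_file"  # hypothetical line
printf '13th_Dalai_Lama\n5th_Dalai_Lama\n' > "$workdir/pattern_file"

# Rewrite every name as: line start, digits, a tab, the name, then the
# "<" that begins the "<!!>" separator. grep reads the patterns from stdin.
result=$(sed "s/.*/^[0-9]+${tab}&</" "$workdir/pattern_file" \
  | grep -E -f - "$workdir/input_file")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Anchoring on the tab and the trailing < keeps 5th_Dalai_Lama from matching inside a longer name such as 15th_Dalai_Lama.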

Related

How to grep with regex lookahead

I can't see what I'm missing in my grep command, can you?
http://regexr.com/5shri
echo "2021-05-09 15:38:56.888 T:1899877296 NOTICE: VideoPlayer::OpenFile:plugin://plugin.video.arteplussept/play/SHOW/069083-002-A" | grep -oE "\w+(?=\/play)/g" -
Expected: arteplussept
You need to:
Use the PCRE regex engine with the -P option, not -E (which stands for POSIX ERE).
Remove /g: grep -o already extracts all matches, so there is no need to embed this modifier in the pattern.
Leave / unescaped; it has no special meaning here.
So, you can just use
grep -oP '\w+(?=/play)'
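As a quick check, the corrected command can be run against the log line from the question. This sketch assumes a grep built with PCRE support (-P); the variable names are only for illustration.

```shell
# Log line taken from the question.
line='2021-05-09 15:38:56.888 T:1899877296 NOTICE: VideoPlayer::OpenFile:plugin://plugin.video.arteplussept/play/SHOW/069083-002-A'

# -o prints only the matched text; (?=/play) requires "/play" to follow
# the word characters without making it part of the match.
match=$(printf '%s\n' "$line" | grep -oP '\w+(?=/play)')
printf '%s\n' "$match"
```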

print filename if several matches are present in file

I want to print the filename only if ALL the matches are present, on different lines.
grep -l -w '10B\|01A\|gencode' */$a*filename.vcf
This prints the filename if ANY of the three matches is present, not only if ALL three are.
Would you consider trying awk? Note that this prints the file name only when all three patterns occur on the same line:
awk '/10B/&&/01A/&&/gencode/{print FILENAME}' */$a*filename.vcf
Try the following, a small edit of your command. It requires all three patterns on the same line, in this order:
grep -l '10B.*01A.*gencode' Input_file
With grep's -P (Perl-compatible regex) option and positive lookaheads, (?=regex), you can match the patterns in any order, as long as they are on the same line:
grep -lwP '(?=.*?10B)(?=.*?01A)(?=.*?gencode)' /path/to/infile
To require every pattern somewhere in the file, even on different lines, chain grep -l calls:
grep -l 'pattern1' files ... | xargs grep -l 'pattern2' | xargs grep -l 'pattern3'
From the grep manual:
-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning will stop on the first match. (-l is specified by POSIX.)
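The chained grep -l approach can be tried on throwaway files. In this sketch the file names all.vcf and partial.vcf are hypothetical, and xargs -r (a GNU option) is assumed so that a stage with no remaining file names is skipped instead of reading standard input.

```shell
workdir=$(mktemp -d)
printf '10B\n01A\ngencode\n' > "$workdir/all.vcf"      # all three words, on separate lines
printf '10B\ngencode\n'      > "$workdir/partial.vcf"  # 01A is missing

# Each stage keeps only the files that also contain the next word.
found=$(grep -lw '10B' "$workdir"/*.vcf \
  | xargs -r grep -lw '01A' \
  | xargs -r grep -lw 'gencode')
found_name=$(basename "$found")
printf '%s\n' "$found_name"
rm -rf "$workdir"
```

File names containing whitespace would need grep -Z together with xargs -0 (both GNU options) to pass through the pipeline intact.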

grep - combining positive and negative expressions

I have come across examples of combining multiple search expressions like
grep -E 'phrase1|phrase2|phrase3'
but I am struggling with combining both positive and negative expressions in a search. I am looking to use grep to extract a list of file names from a directory where the file:
does not contain the text '[downloadedimages]'
AND
contains the text '[images]'
I tried the following but it throws a syntax error [-e: command not found]
grep -v -e '"\[downloadedimages\]"' | -e '"\[images\]"' -l /path/to/files
grep 'images' /path/to/files | grep -v 'downloadedimages'
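The pipeline above filters matching lines rather than file names. To get a list of files, one option is to combine -l (files with a match) and -L (files without a match). The sketch below uses hypothetical file names, -F so that the square brackets are taken literally, and the GNU xargs -r option.

```shell
workdir=$(mktemp -d)
printf '[images]\n'                     > "$workdir/keep.txt"
printf '[images]\n[downloadedimages]\n' > "$workdir/skip.txt"
printf 'nothing relevant\n'             > "$workdir/other.txt"

# First list the files that contain [images], then keep only those
# that do not contain [downloadedimages].
files=$(grep -lF '[images]' "$workdir"/*.txt \
  | xargs -r grep -LF '[downloadedimages]')
kept=$(basename "$files")
printf '%s\n' "$kept"
rm -rf "$workdir"
```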

Grep digits after match

I would like to grep digits inside a set of double quotes after a match.
Given foo.txt below,
foo: "32.1" bar: "42.0" misc: "52.3"
I want to extract the number after bar, 42.0.
The following line will match, but I'd like to extract the digit. I guess I could pipe the output back into grep looking for \d+.\d+, but is there a better way?
grep -o -P 'bar: "\d+.\d+"' foo.txt
One way is to use lookahead and lookbehind assertions (with the dot escaped so that it matches only a literal decimal point):
grep -o -P '(?<=bar: ")\d+\.\d+(?=")' foo.txt
Another is to use sed:
sed -e 's/.*bar: "\([[:digit:]]\+\.[[:digit:]]\+\)".*/\1/' foo.txt
You could also use grep with \K, which discards everything matched before it from the output:
$ echo 'foo: "32.1" bar: "42.0" misc: "52.3"' | grep -oP 'bar:\s+"\K[^"]*(?=")'
42.0
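In a regex an unescaped . matches any character, so writing \. keeps the match to a literal decimal point. A minimal sketch of both routes on the sample line, assuming a grep with PCRE support for the first command and a sed that accepts -E for the second:

```shell
line='foo: "32.1" bar: "42.0" misc: "52.3"'

# PCRE: \K drops the prefix from the reported match, (?=") checks the closing quote.
value=$(printf '%s\n' "$line" | grep -oP 'bar: "\K\d+\.\d+(?=")')

# sed with extended regexes: capture the number and print only lines that matched.
value_sed=$(printf '%s\n' "$line" | sed -nE 's/.*bar: "([0-9]+\.[0-9]+)".*/\1/p')

printf '%s\n%s\n' "$value" "$value_sed"
```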

How to grep -w for 2 words that might or might not occur in the same line?

I need the combined output of these two commands. Is there a way to grep just once? The file may be really big (> 1 GB).
$ grep -w 'word1' infile
$ grep -w 'word2' infile
I don't need them on the same line, as in "grep for 2 words existing on the same line". I just need to avoid iterating over the whole file twice.
Use this:
grep -E -w "word1|word2" infile
or, with the older egrep spelling (reported as obsolescent by recent GNU grep releases):
egrep -w "word1|word2" infile
It will match lines matching either word1, word2 or both.
From man grep:
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below).
Test
$ cat file
The fish [ate] the bird.
[This is some] text.
Here is a number [1001] and another [1201].
$ grep -E -w "is|number" file
[This is some] text.
Here is a number [1001] and another [1201].
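The same single-pass search can also be written with one -e option per word, which avoids the alternation syntax altogether. A small sketch that compares both spellings on the test file shown above (recreated here in a temporary directory):

```shell
workdir=$(mktemp -d)
printf '%s\n' \
  'The fish [ate] the bird.' \
  '[This is some] text.' \
  'Here is a number [1001] and another [1201].' > "$workdir/file"

# One pass over the file in both cases; a line is printed if either word matches.
alt=$(grep -E -w 'is|number' "$workdir/file")
multi=$(grep -w -e 'is' -e 'number' "$workdir/file")
printf '%s\n' "$multi"
rm -rf "$workdir"
```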

Resources