Grep digits after match - grep

I would like to grep the digits inside a set of double quotes after a match.
Given foo.txt below,
foo: "32.1" bar: "42.0" misc: "52.3"
I want to extract the number after bar, 42.0.
The following line matches, but it prints the whole match and I only want the number. I guess I could pipe the output back into grep looking for \d+\.\d+, but is there a better way?
grep -o -P 'bar: "\d+\.\d+"' foo.txt

One way is to use look-ahead and look-behind assertions (with the dot escaped so it matches a literal period):
grep -o -P '(?<=bar: ")\d+\.\d+(?=")' foo.txt
Another is to use sed:
sed -e 's/.*bar: "\([[:digit:]]\+\.[[:digit:]]\+\)".*/\1/' foo.txt

You could also use the grep below; \K drops everything matched before it from the output:
$ echo 'foo: "32.1" bar: "42.0" misc: "52.3"' | grep -oP 'bar:\s+"\K[^"]*(?=")'
42.0
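
To tie the answers above together, here is a minimal sketch that runs a \K-based grep and a portable sed on the sample line. It assumes GNU grep built with -P support; the variable names line, with_grep and with_sed are only for illustration. The dot is escaped so that it matches a literal period rather than any character.

```shell
# Sample line from the question (normally this would live in foo.txt).
line='foo: "32.1" bar: "42.0" misc: "52.3"'

# GNU grep with PCRE: \K drops 'bar: "' from the reported match.
with_grep=$(printf '%s\n' "$line" | grep -oP 'bar: "\K\d+\.\d+')

# POSIX sed: capture the number and print only lines where the substitution happened.
with_sed=$(printf '%s\n' "$line" | sed -n 's/.*bar: "\([0-9][0-9]*\.[0-9][0-9]*\)".*/\1/p')

echo "$with_grep"
echo "$with_sed"
```

Both commands print 42.0 for this input. The sed form may be the safer choice on systems where grep lacks -P.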

Related

grep for pattern with special character and output only matched string

Team,
I want to grep for a substring containing - (here, test-) and output only that string, not the whole line. How can I? I know I can split on spaces with awk and pull the field using $, but I want to know how to do it in grep.
echo $test_pods_info | grep -F 'test-'
output
test-78ac951e-89a6-4199-87a4-db8a1b8b054f export-9b55f0d5-071d-431-1d2ux0-avexport-xavierisp-sjc4--a4dd85-102 1/1 Running 0 19h
expected output
test-78ac951e-89a6-4199-87a4-db8a1b8b054f
awk is more suitable for this, as you want the first field of a matching line:
awk '/test-/{print $1}' <<< "$test_pods_info"
test-78ac951e-89a6-4199-87a4-db8a1b8b054f
If you really want to use grep, then this might be what you're looking for:
grep -o 'test-\S*' <<< "$test_pods_info"
or:
grep -o 'test-[^[:space:]]*' <<< "$test_pods_info"
Try
echo $test_pods_info | grep -o 'test-'
The -o option is described by grep --help as:
show[ing] only the part of a line matching PATTERN
Of course, this will only print test-, so you'll need to extend your regex to cover the rest of the token:
grep -oE 'test-[^[:space:]]+'
Figured it out:
echo $test_pods_info | grep -o "test-\w*-\w*-\w*-\w*-\w*"
output
test-78ac951e-89a6-4199-87a4-db8a1b8b054f
But I wish there were a simpler way, something like test-*.
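
For what it's worth, the character-class pattern from the earlier answer is about as close to that simple form as grep gets. A small sketch on the sample line from the question (the variable pod is illustrative, and printf is used instead of an unquoted echo so that whitespace is preserved):

```shell
# Sample pod line from the question.
test_pods_info='test-78ac951e-89a6-4199-87a4-db8a1b8b054f export-9b55f0d5-071d-431-1d2ux0-avexport-xavierisp-sjc4--a4dd85-102 1/1 Running 0 19h'

# -o prints only the matched text: "test-" followed by every non-space character.
pod=$(printf '%s\n' "$test_pods_info" | grep -o 'test-[^[:space:]]*')
echo "$pod"
```

This prints test-78ac951e-89a6-4199-87a4-db8a1b8b054f. Note that the pattern is not anchored to the start of a token, so a test- in the middle of another field would also be reported, from that point to the next space.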

print filename if several matches are present in file

I want to print the filename only if ALL the matches are present in the file, possibly on different lines.
grep -l -w '10B\|01A\|gencode' */$a*filename.vcf
This prints the filename when any one of the patterns matches, not only when ALL three are present.
Would you consider awk? Note that the following only prints the filename when all three patterns occur on the same line:
awk '/10B/&&/01A/&&/gencode/{print FILENAME}' */$a*filename.vcf
Try the following, a small edit of your solution. It requires the three patterns on one line, in this order:
grep -l '10B.*01A.*gencode' Input_file
With grep's -P (Perl-compatible) option and positive lookaheads, (?=regex), you can match the patterns in any order, as long as they are on the same line:
grep -lwP '(?=.*?10B)(?=.*?01A)(?=.*?gencode)' /path/to/infile
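To require all patterns anywhere in the file, even on different lines, chain grep -l calls so that each one only searches the files that passed the previous one:
grep -l 'pattern1' files ... | xargs grep -l 'pattern2' | xargs grep -l 'pattern3'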
From the grep manual:
-l, --files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning will stop on the first match. (-l is specified by POSIX.)
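
If the three patterns may sit on different lines, another option is a single awk pass that remembers, per file, which patterns have been seen. The sketch below is only an illustration: the demo files a.vcf and b.vcf and their contents are made up, and unlike grep -w it matches substrings rather than whole words.

```shell
# Hypothetical demo files: a.vcf has all three patterns on different lines, b.vcf lacks one.
dir=$(mktemp -d)
printf '10B\nfoo\n01A\ngencode\n' > "$dir/a.vcf"
printf '10B\n01A\n' > "$dir/b.vcf"

# Record which patterns each file has seen; at END, print files that saw all three.
matched=$(awk '
  /10B/     { a[FILENAME] = 1 }
  /01A/     { b[FILENAME] = 1 }
  /gencode/ { c[FILENAME] = 1 }
  END { for (f in a) if ((f in b) && (f in c)) print f }
' "$dir"/*.vcf)

echo "$matched"
```

Only the path of a.vcf is printed. Because awk's for-in loop has no defined order, the output may need a sort when many files match.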

grep combine -f and -E

I want to combine the contents of a pattern file with a regular expression, i.e. grep -E -f.
The input file has the format
2 List_of_anthropologists<!!>Q1279970
3 List_of_Governors_of_Alabama<!!>Q558677
2027476 12th_Dalai_Lama<!!>Q25240
etc..
and the pattern file has the format:
13th_Dalai_Lama
5th_Dalai_Lama
etc...
I can make it work by manually putting in the pattern "13th_Dalai_Lama":
grep -E "^(\d*)(?:\t)13th_Dalai_Lama" input_file
But how do I combine this with the -f option so that 13th_Dalai_Lama is replaced by the lines in the pattern file?
With GNU grep, GNU sed and bash:
grep -f <(sed 's/.*/\\b&\\b/' pattern_file) input_file
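
A variation that stays within -E and avoids process substitution is to let sed wrap every title in the fixed parts of the regex and write the result to a file for -f. This is a rough sketch under some assumptions: the number and the title are separated by whitespace, the titles contain no regex metacharacters, and the 13th_Dalai_Lama row (including its IDs) is invented for the demo. The name patterns.ere is hypothetical.

```shell
dir=$(mktemp -d)

# Pattern file from the question.
printf '%s\n' '13th_Dalai_Lama' '5th_Dalai_Lama' > "$dir/pattern_file"

# Input in the question's format; the last row is made up for the demo.
printf '%s\n' \
  '2 List_of_anthropologists<!!>Q1279970' \
  '2027476 12th_Dalai_Lama<!!>Q25240' \
  '7 13th_Dalai_Lama<!!>Q0' > "$dir/input_file"

# Wrap each title in an ERE: leading number, whitespace, the title, then the <!!> separator.
sed 's/.*/^[0-9]+[[:space:]]+&<!!>/' "$dir/pattern_file" > "$dir/patterns.ere"

hits=$(grep -E -f "$dir/patterns.ere" "$dir/input_file")
echo "$hits"
```

Only the 13th_Dalai_Lama row is printed; anchoring on both sides keeps 5th_Dalai_Lama from matching inside longer titles.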

represent digits using regular expression

grep -w "ing_[0-9][0-9][0-9][0-9]"
The command above works. But is there a shorter way to write 4 digits?
This does not work:
grep -w "ing_[0-9]\+ {4}"
By default, grep uses basic regular expressions (BRE). In BRE, you need to escape the curly braces so that they are treated as a repetition quantifier.
grep -w "ing_[0-9]\{4\}" file
Example:
$ echo 'ing_6786 says' | grep -w "ing_[0-9]{4}"
$ echo 'ing_6786 says' | grep -w "ing_[0-9]\{4\}"
ing_6786 says
If your grep supports Perl-compatible regular expressions, the -P option also accepts unescaped braces:
grep -wP "ing_[0-9]{4}"
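
Extended regular expressions (-E) also take the braces unescaped, which avoids depending on -P. A minimal sketch comparing the spellings; the five-digit input is an extra case added here for illustration and is not from the question:

```shell
sample='ing_6786 says'

# BRE (default): the interval braces must be escaped.
bre=$(printf '%s\n' "$sample" | grep -w 'ing_[0-9]\{4\}')

# ERE (-E): braces are interval operators without escaping.
ere=$(printf '%s\n' "$sample" | grep -wE 'ing_[0-9]{4}')

# With -w, a five-digit run is rejected because the match must end at a word boundary.
# "|| true" keeps grep's no-match exit status from aborting a script run with set -e.
five=$(printf '%s\n' 'ing_67865 says' | grep -wE 'ing_[0-9]{4}' || true)

echo "$bre"
echo "$ere"
echo "five-digit match: '$five'"
```

The first two commands print the sample line, and the third finds nothing, so -w together with {4} means exactly four digits here.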

How to grep -w for 2 words that might or might not occur in the same line?

I need the combined output of the 2 commands below. Is there a way to grep just once? The file may be really big (>1 GB).
$ grep -w 'word1' infile
$ grep -w 'word2' infile
I don't need them on the same line like grep for 2 words existing on the same line. I just need to avoid redundant iteration of the whole file
Use this:
grep -E -w "word1|word2" infile
or, with the older egrep spelling (deprecated in favor of grep -E):
egrep -w "word1|word2" infile
It will match lines matching either word1, word2 or both.
From man grep:
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below).
Test
$ cat file
The fish [ate] the bird.
[This is some] text.
Here is a number [1001] and another [1201].
$ grep -E -w "is|number" file
[This is some] text.
Here is a number [1001] and another [1201].
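
An equivalent spelling that needs no alternation, and therefore no -E, is to pass each word with its own -e option; the file is still read only once. A short sketch using the test file above, written to a temporary file (the variable names are illustrative):

```shell
# Recreate the test file from the answer above.
file=$(mktemp)
printf '%s\n' \
  'The fish [ate] the bird.' \
  '[This is some] text.' \
  'Here is a number [1001] and another [1201].' > "$file"

# One pass over the file; a line is printed if it contains either whole word.
out=$(grep -w -e 'is' -e 'number' "$file")
echo "$out"
```

This prints the same two lines as the -E version. For plain words, adding -F tells grep to treat the patterns as fixed strings.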
