grep: constructing a regex pattern to exclude several groups

I have a folder with three files:
$ ls
aaa.txt abc.txt def.txt
If I want to grep the output excluding the abc.txt file I can do:
$ ls | grep -v 'abc'
aaa.txt
def.txt
If I want to exclude two files I can do:
$ ls | grep -v 'abc' | grep -v 'def'
aaa.txt
But how can I do this using one regex and one grep invocation?
This does not work:
$ ls | grep -v '[(abc)(def)]'
neither does this:
$ ls | grep -v "abc|def"

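Alternation with | is an ERE (Extended Regular Expression) feature. grep uses BRE (Basic Regular Expressions) by default, where an unescaped | is a literal character, so "abc|def" only matches that exact text. Your first attempt fails for a different reason: [(abc)(def)] is a bracket expression that matches any single one of the characters ( ) a b c d e f, so -v drops every line containing any of them. Use -E to enable ERE: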
grep -vE "abc|def"
or use egrep, which is equivalent to grep -E (recent GNU grep releases warn that egrep is obsolescent):
egrep -v "abc|def"
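For a quick check without touching a real directory, here is a small sketch that feeds the three file names from the question through the -E form. It also shows an alternative not mentioned above: repeating -e once per pattern, which is a POSIX grep feature and works in the default BRE mode too. The variable names are only for illustration.

```shell
# Simulate the `ls` output from the question.
files=$(printf '%s\n' aaa.txt abc.txt def.txt)

# ERE alternation: drop lines matching either alternative.
kept_ere=$(printf '%s\n' "$files" | grep -vE 'abc|def')

# Same filter with one -e per pattern; no -E needed.
kept_e=$(printf '%s\n' "$files" | grep -v -e 'abc' -e 'def')

echo "$kept_ere"   # aaa.txt
echo "$kept_e"     # aaa.txt
```

The -e form can be easier to read once the list of exclusions grows, since each pattern stays separate.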

Related

Combine multiple grep regular expression into one command

Consider this input file:
bam/pfg413T.GRCh38DH.target.bai
bam/pfg413T.GRCh38DH.target.bam
bam/pfg413T.GRCh38DH.target.bam
bam/pfg416G.GRCh38DH.target.bai
bam/pfg416G.GRCh38DH.target.bam
How can I combine the following two grep -E calls into a single grep -E call?
readlink -f exomesinglesample_out/bam/pfg* | grep -E 'pfg[0-9]*G' | grep -E 'bam$'
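One possible approach, offered as a sketch rather than a definitive answer: since the G marker always comes before the extension on the same line, the two patterns can be joined with .* in between. The example below uses the sample paths from the question instead of real readlink output, and the variable names are made up for the demonstration.

```shell
# Sample paths from the question (the repeated line is kept as given).
paths=$(printf '%s\n' \
  bam/pfg413T.GRCh38DH.target.bai \
  bam/pfg413T.GRCh38DH.target.bam \
  bam/pfg413T.GRCh38DH.target.bam \
  bam/pfg416G.GRCh38DH.target.bai \
  bam/pfg416G.GRCh38DH.target.bam)

# Original two-stage filter.
two_step=$(printf '%s\n' "$paths" | grep -E 'pfg[0-9]*G' | grep -E 'bam$')

# Single pattern: the G match, then anything, then "bam" at end of line.
one_step=$(printf '%s\n' "$paths" | grep -E 'pfg[0-9]*G.*bam$')

echo "$one_step"   # bam/pfg416G.GRCh38DH.target.bam
```

This only works because the order of the two matches within a line is known. When the order is not fixed, a chain of greps or an awk condition with && is usually simpler than a single regex.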

How to write noncapturing groups in egrep

The following command does not correctly capture the 16714 from the line "16714 ssh -f -N -T -R3300:localhost:22":
egrep -o '^[^ ]+(?= .*[R]3300:localhost:22)'
(Switching to grep with the -P flag does work, however. I was expecting egrep to be able to handle this.)
grep -P makes grep use Perl-compatible regular expressions (PCRE).
egrep is the same as grep -E, which makes grep use ERE (extended regular expression) syntax; ERE does not support lookahead.
You can find a quick reference of the differences between Perl and ERE (and others) here: http://www.greenend.org.uk/rjk/tech/regexp.html
To handle this with POSIX grep, you would use grep to isolate the lines of interest and then use cut to isolate the fields of interest:
$ echo "16714 ssh -f -N -T -R3300:localhost:22" | grep 'R3300:localhost:22' | cut -d' ' -f1
16714
Or, just use awk:
$ echo "16714 ssh -f -N -T -R3300:localhost:22" | awk '/R3300:localhost:22/{print $1}'
16714
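If you would rather stay with ERE syntax in a single command, a capture group can stand in for the lookahead. This is a sketch using sed -E (supported by GNU and BSD sed); it is not the original poster's method, and the variable names are only illustrative.

```shell
line='16714 ssh -f -N -T -R3300:localhost:22'

# Capture the first space-delimited field, require the tunnel spec later
# on the line, and print only the captured field.
pid=$(printf '%s\n' "$line" | sed -nE 's/^([^ ]+) .*R3300:localhost:22.*/\1/p')

echo "$pid"   # 16714
```

The -n option together with the p flag means lines that do not match the pattern produce no output at all, which mirrors what grep -o would do.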

Why does grep print "Invalid range end"?

I have a file a:
$ cat a
abcd
kaka
when using the command:
$ grep -e '[a-d]' a
abcd
kaka
That works well, but why do these commands fail?
$ grep -e '[\x61-\x74]' a
grep: Invalid range end
$ grep -e '[\u0061-\u0074]' a
grep: Invalid range end
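In POSIX BRE and ERE, a backslash inside a bracket expression is an ordinary character, so \x61 and \u0061 are not read as escapes and the brackets end up containing ranges you did not intend. Assuming that your version of grep supports PCRE ("Perl-compatible regular expressions"), the -P option makes it interpret \x61 as a hex escape, so you can try: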
grep -P '[\x61-\x74]' a
This would return the expected output:
abcd
kaka
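If grep -P is not available, one workaround (a sketch, not part of the answer above) is to let the shell's printf expand the escapes before grep ever sees the pattern. POSIX printf only guarantees octal escapes in the format string, so the hex values are written in octal here. The temporary file stands in for the file a from the question.

```shell
# Recreate the sample file from the question in a temporary location.
tmpfile=$(mktemp)
printf '%s\n' abcd kaka > "$tmpfile"

# \141 and \164 are the octal forms of 0x61 ('a') and 0x74 ('t').
pattern=$(printf '[\141-\164]')

# LC_ALL=C keeps the range a plain byte range regardless of locale.
matched=$(LC_ALL=C grep -e "$pattern" "$tmpfile")

echo "$pattern"   # [a-t]
echo "$matched"

rm -f "$tmpfile"
```

Both sample lines contain at least one letter between a and t, so both are printed, the same result as with '[a-d]'.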

Combining many greps in a single grep call

I have a script that checks whether a process is running by name, using ps and grep. The problem is that I have to grep for many things to avoid false positives.
So far, I have a grep chain that looks as follows:
ps -ef | grep -i $process_name | grep -i perl | grep -v do_all | grep -v grep
Four greps. Three of them are there to avoid false positives.
I would like to know if there's a way to avoid such a 'piping chain' and use a single grep to achieve the same result.
Though some of you may answer that there are cleaner ways to find out whether a process exists, I would still like an answer to this question, just to better understand how to use the grep command.
There's no real reason to avoid chaining them, is there?
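If you really wanted to, you could combine them with | in egrep (equivalent to grep -E). Note that | means OR, so the first stage below matches lines containing either $process_name or perl, which is looser than the original chain. The -v stage is an exact equivalent, since excluding 'do_all|grep' drops lines containing either one:
ps -ef | egrep -i "$process_name|perl" | egrep -v 'do_all|grep'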
Here's one way using GNU awk:
ps -ef | awk -v process="$process_name" 'BEGIN { IGNORECASE=1 } $0 ~ process && /perl/ && !/do_all/'
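IGNORECASE is specific to GNU awk. As a rough sketch of the same idea for other awk implementations, the line can be lowercased with tolower() before testing it. The process name and the sample ps lines below are hypothetical and only there to make the example self-contained. Unlike the grep chain, index() treats the name as a fixed string rather than a regex.

```shell
process_name='myjob'   # hypothetical process name

# Hypothetical stand-in for `ps -ef` output.
sample=$(printf '%s\n' \
  'user 101 1 perl /opt/MyJob.pl' \
  'user 102 1 perl /opt/do_all.pl myjob' \
  'user 103 1 grep -i myjob' \
  'user 104 1 python myjob.py')

# All four conditions of the grep chain, ANDed in one awk program.
matches=$(printf '%s\n' "$sample" | awk -v process="$process_name" '
  { line = tolower($0) }
  index(line, tolower(process)) && line ~ /perl/ && line !~ /do_all/ && line !~ /grep/')

echo "$matches"   # user 101 1 perl /opt/MyJob.pl
```

Because the conditions are joined with &&, this keeps the AND semantics of the original chain, which a single alternation with | cannot express.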

Inspect stdout from the middle of chained apps

Consider this example chain:
cat foo.txt | grep -v foo | grep -v bar | grep -v baz
I'd like to inspect the stdout of the second grep as well as the final stdout:
cat foo.txt | grep -v foo | grep -v bar | UNKNOWN | grep -v baz
So I need a tool, UNKNOWN, that, for instance, dumps stdout to a file and also passes it along the chain.
Does such a tool exist (both Windows and Linux answers are relevant)?
I think there's a tool called 'tee' that gives you that.
Update reflecting comment from Bob:
cat foo.txt | grep -v foo | grep -v bar | tee -a inspection.txt | grep -v baz
I have not been able to try it, but as Gabriel and Bob pointed out, the tee command (see man tee) will help you out. tee copies its input to stdout as well as to the named files. As Bob said in his comment:
cat foo.txt | grep -v foo | grep -v bar | tee -a inspection.txt | grep -v baz
This takes the output of grep -v bar and sends it both to stdout and to inspection.txt. The -a flag makes tee append to inspection.txt rather than overwrite it.
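To see both stages side by side, here is a small self-contained sketch. The sample lines and the temporary directory are made up for the demonstration, and tee is used without -a so each run starts from a fresh inspection file.

```shell
# Work in a throwaway directory with a made-up foo.txt.
tmpdir=$(mktemp -d)
printf '%s\n' foo bar baz qux > "$tmpdir/foo.txt"

# tee saves what the second grep emitted, then passes it on unchanged.
final=$(grep -v foo "$tmpdir/foo.txt" | grep -v bar \
  | tee "$tmpdir/inspection.txt" | grep -v baz)
middle=$(cat "$tmpdir/inspection.txt")

echo "after second grep:"
echo "$middle"   # baz and qux, one per line
echo "final output:"
echo "$final"    # qux

rm -rf "$tmpdir"
```

Several tee stages can be placed in the same pipeline, one per point you want to inspect, as long as each writes to its own file.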

Resources