Combine -v option for grep with -A option - grep

Is it possible to somehow combine -v with -A?
I have an example file:
abc
1
2
3
ACB
def
abc
1
2
3
ABC
xyz
With -A I can see the parts I want to "cut":
$ grep abc -A 4 grep_v_test.txt
abc
1
2
3
ACB
--
abc
1
2
3
ABC
Is there some option I can specify to see only
def
xyz
?
I found this answer, Combining -v flag and -A flag in grep, but it is not working for me. I tried:
$ sed -e "/abc/{2;2;d}" grep_v_test.txt
sed: -e expression #1, char 8: unknown command: `;'
also
$ sed "/abc/2d" grep_v_test.txt
sed: -e expression #1, char 6: unknown command: `2'
or
$ sed "/abc/+2d" grep_v_test.txt
sed: -e expression #1, char 6: unknown command: `+'
Sed version is:
$ sed --version
GNU sed version 4.2.1
edit1:
Based on the comments I experimented a little with both solutions, but they do not work the way I want.
For grep -v -A 1 abc I would expect the lines abc and 1 to be removed and the rest to be printed. awk 'c&&!--c; /abc/ {c=2}' grep_v_test.txt prints just the lines containing 2, which is not what I wanted.
It is very similar with sed:
$ sed -n '/abc/{n;n;p}' grep_v_test.txt
2
2
edit2:
It seems I'm not able to describe it properly; let me try again.
What grep -A N abc file does is print each line matching abc plus the N lines after it. I want to remove what grep -A shows, so in a file
abc
1
2
3
ACB
def
DEF
abc
1
2
3
ABC
xyz
XYZ
I want to remove each block from abc through ACB (or ABC) and print only the rest, that is, def, DEF, xyz and XYZ out of:
abc
1
2
3
ACB
def
DEF
abc
1
2
3
ABC
xyz
XYZ
so 4 lines will remain... The awk solution prints just def and xyz and skips DEF and XYZ...

To skip 5 lines of context, starting with the initial matching line:
$ awk '/abc/{c=5} c&&c--{next} 1' file
def
xyz
See Extract Nth line after matching pattern for other related scripts.
Regarding the comments below, here's the difference between this answer and @fedorqui's answer:
$ cat file
now is the Winter
of our discontent
abc
1
2
bar
$ awk '/abc/{c=3} c&&c--{next} 1' file
now is the Winter
of our discontent
bar
$ awk '/abc/ {c=0} c++>2' file
bar
See how @fedorqui's script unconditionally skips the first 2 lines of the file?
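If you need this more than once, the awk one-liner from this answer can be wrapped in a small function. This is only a sketch: the name grep_v_after and its argument order are made up for illustration, and it takes the -A style count (lines after the match), so it sets the counter to N+1 to cover the matching line too.

```shell
#!/bin/sh
# Hypothetical helper: print every line except each match and the N lines after it.
# Usage: grep_v_after PATTERN N FILE
# Note: awk processes backslash escapes in -v values, so keep PATTERN simple.
grep_v_after() {
    awk -v pat="$1" -v n="$2" '$0 ~ pat {c = n + 1} c && c-- {next} 1' "$3"
}

# Sample data from the question's second edit.
sample=$(mktemp)
printf '%s\n' abc 1 2 3 ACB def DEF abc 1 2 3 ABC xyz XYZ > "$sample"

result=$(grep_v_after abc 4 "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

With a count of 0 this behaves like plain grep -v, since only the matching line itself is skipped.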

If I understand you properly, you want to print all the lines that come more than 4 lines after a given match.
For this you can tweak the solutions in Extract Nth line after matching pattern and say:
$ awk '/abc/ {c=0} c++>4' file
def
DEF
xyz
XYZ
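As a side note on the sed attempts in the question: they fail because the address range is missing its comma. Assuming GNU sed (the question mentions version 4.2.1), the addr1,+N form deletes a match and the N lines after it. A minimal sketch, not portable to every sed:

```shell
#!/bin/sh
# Sample data from the question's second edit.
sample=$(mktemp)
printf '%s\n' abc 1 2 3 ACB def DEF abc 1 2 3 ABC xyz XYZ > "$sample"

# GNU extension: /abc/,+4 selects each matching line plus the 4 lines after it.
sed_result=$(sed '/abc/,+4d' "$sample")
printf '%s\n' "$sed_result"
rm -f "$sample"
```

One difference from the awk scripts: a second abc that falls inside an already open range does not restart the count.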

Related

Why do these patterns return the same result?

I saw this question: count (non-blank) lines-of-code in bash
I understand that this pattern is correct:
grep -vc ^$ filename
Why does this pattern return the same result?
grep -c '[^ ]' filename
What is the trick in '[^ ]'?
$ printf 'foo 123\n \nxyz\n\t\n' > ip.txt
$ cat -T ip.txt
foo 123
 
xyz
^I
$ grep -vc '^$' ip.txt
4
$ grep -c '[^ ]' ip.txt
3
$ grep -c '[^[:blank:]]' ip.txt
2
grep -c '[^ ]' counts any line that contains at least one non-space character. For example, foo 123 is counted because letters and digits are not space characters, and so is the line holding only a tab. The two patterns therefore agree only when no line consists solely of spaces, so which one to use depends on whether such lines should be counted.
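To see both cases side by side, here is a small sketch that reruns the counts on the sample above and then on a second, made-up sample whose only blank line is truly empty. The variable names are just for illustration.

```shell
#!/bin/sh
# Sample from the answer: no line is empty, but two lines hold only whitespace.
ip=$(mktemp)
printf 'foo 123\n \nxyz\n\t\n' > "$ip"
non_empty=$(grep -vc '^$' "$ip")            # lines that are not zero-length
non_space=$(grep -c '[^ ]' "$ip")           # lines with at least one non-space character
non_blank=$(grep -c '[^[:blank:]]' "$ip")   # lines with a character that is neither space nor tab
echo "$non_empty $non_space $non_blank"

# Hypothetical second sample: the only blank line is completely empty.
printf 'foo\n\nbar\n' > "$ip"
same_a=$(grep -vc '^$' "$ip")
same_b=$(grep -c '[^ ]' "$ip")
echo "$same_a $same_b"
rm -f "$ip"
```

On the second sample both commands report the same count, which is presumably why they looked equivalent in the question.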

How can I list only directory names, with no trailing "/"?

By running the following command in the folder
ls -d */ | cut -f1 -d'/'
I get entries like:
env1
env2
env3
env4
How can I use cat/grep, yq/jq, or any other alternative command(s) instead of the above command?
for dir in */; do
    echo "${dir%/}"
done
There are several options. You can use the tree command with the options:
# d: list only directories
# i: do not print indentation lines
# L: max display depth of the directory tree
tree -di -L 1 "$(pwd)"
Or you can use grep to pick out the directories and awk to print their names:
# F: input field separator
# $9: print the ninth column of the output (this breaks on names containing spaces)
ls -l | grep "^d" | awk -F" " '{print $9}'
Or you can use the sed command to remove the slash:
# structure: s|regexp|replacement|flags
# g: apply the replacement to all matches to the regexp, not just the first
ls -d */ | sed 's|[/]||g'
I found these solutions in this post.
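If the output feeds another script, a variant of the glob loop that avoids parsing ls may be more robust. This is a sketch only; list_dirs is a made-up helper name, and the demo directory is created just for the example.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the immediate subdirectories of $1
# (default: current directory), one per line and without the trailing slash.
# The body runs in a subshell, so the caller's working directory is untouched.
list_dirs() (
    cd -- "${1:-.}" || exit 1
    for dir in */; do
        [ -d "$dir" ] || continue   # skips the literal "*/" when nothing matches
        printf '%s\n' "${dir%/}"
    done
)

# Demo on a throwaway directory, including a name with a space.
demo=$(mktemp -d)
mkdir "$demo/env1" "$demo/env2" "$demo/my env"
touch "$demo/notes.txt"
listing=$(list_dirs "$demo")
printf '%s\n' "$listing"
rm -rf "$demo"
```

Unlike the ls -l | awk approach, this keeps names containing spaces intact. Like */ in general, it also lists symlinks that point to directories.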

How to grep lines non-repeatedly for same command?

I have a space-separated file that looks like this:
$ cat in_file
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004927566.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004919950.1 FAD_binding_3
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 FAD_binding_3
I am using the following shell script utilizing grep to search for strings:
$ cat search_script.sh
grep "GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1" Pfam_anntn_temp.txt
grep "GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1" Pfam_anntn_temp.txt
The problem is that I want each grep command to return only the first instance of the string it finds exclusive of the previous identical grep command's output.
I need an output which would look like this:
$ cat out_file
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 FAD_binding_3
in which line 1 is exclusively the output of the first grep command and line 2 is exclusively the output of the second grep command. How do I do it?
P.S. I am running this on a big file (>125,000 lines). So, search_script.sh is mostly composed of unique grep commands. It is the identical commands' execution that is messing up my downstream analysis.
I'm assuming you are generating search_script.sh automatically from the contents of in_file. If you can count how many times you'll repeat the same grep command you can just use grep once and use head, for example if you know you'll be using it 2 times:
grep "foo" bar.txt | head -2
This will output the first 2 occurrences of "foo" in bar.txt.
If you have to do the grep commands separately, for example if you have other code in between the grep commands, you can mix head and tail:
grep "foo" bar.txt | head -1 | tail -1
Some other commands...
grep "foo" bar.txt | head -2 | tail -1
head -N displays the first N lines of the input.
tail -N displays the last N lines of the input.
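The head/tail pair can be folded into one small function. The sketch below uses sed -n 'Np' instead, with a made-up helper name nth_match. One difference worth knowing: when there are fewer than N matches, head -N | tail -1 repeats the last match, while this version prints nothing.

```shell
#!/bin/sh
# Hypothetical helper: print only the N-th line of FILE that contains the fixed string.
# Usage: nth_match STRING N FILE
nth_match() {
    grep -F -- "$1" "$3" | sed -n "${2}p"
}

# Sample data copied from the question.
in_file=$(mktemp)
cat > "$in_file" <<'EOF'
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004927566.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004919950.1 FAD_binding_3
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 FAD_binding_3
EOF

key='GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1'
first=$(nth_match "$key" 1 "$in_file")
second=$(nth_match "$key" 2 "$in_file")
third=$(nth_match "$key" 3 "$in_file")   # there is no third match, so this stays empty
printf '%s\n' "$first" "$second"
rm -f "$in_file"
```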
If you really MUST always use the same command, but ensure that the outputs always differ, the only way I can think of to achieve this is using temporary files and a complex sequence of commands:
cat foo.bar.txt.tmp 2>&1 | xargs -I xx echo "| grep -v \\'xx\\' " | tr '\n' ' ' | xargs -I xx sh -c "grep 'foo' bar.txt xx | head -1 | tee -a foo.bar.txt.tmp"
So to explain this command, given foo as a search string and bar.txt as the filename, then foo.bar.txt.tmp is a unique name for a temporary file. The temporary file will hold the strings that have already been output:
cat foo.bar.txt.tmp 2>&1 : outputs the contents of the temporary file. If none is present, will output an error message to stdout, (important because if the output was empty the rest of the command wouldn't work.)
xargs -I xx echo "| grep -v \\'xx\\' " adds | grep -v to the start of each line in the temporary file, grep -v something excludes lines that include something.
tr '\n' ' ' replaces newlines with spaces, to have on a single string a sequence of grep -vs.
xargs -I xx sh -c "grep 'foo' bar.txt xx | head -1 | tee -a foo.bar.txt.tmp" runs a new command, grep 'foo' bar.txt xx | head -1 | tee -a foo.bar.txt.tmp, replacing xx with the previous output. xx should be the sequence of grep -vs that exclude previous outputs.
head -1 makes sure only one line is output at a time
tee -a foo.bar.txt.tmp appends the new output to the temporary file.
Just be sure to clear the temporary files, rm *.tmp, at the end of your script.
If I understand the question correctly and you want to remove duplicates based on the last field of each line, then try the following (this is an easy task for awk):
awk '!a[$NF]++' Input_file

Why do "docker run -t" outputs include \r in the command output?

I'm using Docker client Version: 18.09.2.
When I start a container interactively, run a date command, and pipe its output to hexdump for inspection, I see a trailing \n as expected:
$ docker run --rm -i -t alpine
/ # date | hexdump -c
0000000   T   h   u       M   a   r           7       0   0   :   1   5
0000010   :   0   6       U   T   C       2   0   1   9  \n
000001d
However, when I pass the date command as an entrypoint directly and run the container, I get a \r \n every time there's a new line in the output.
$ docker run --rm -i -t --entrypoint=date alpine | hexdump -c
0000000   T   h   u       M   a   r           7       0   0   :   1   6
0000010   :   1   9       U   T   C       2   0   1   9  \r  \n
000001e
This is weird.
It totally doesn't happen when I omit -t (not allocating any TTY):
$ docker run --rm -i --entrypoint=date alpine | hexdump -c
0000000   T   h   u       M   a   r           7       0   0   :   1   7
0000010   :   3   0       U   T   C       2   0   1   9  \n
000001d
What's happening here?
This sounds dangerous, as I use the docker run command in my scripts. If I forget to omit -t there, the output I collect from docker run will contain invisible, non-printable \r characters, which can cause all sorts of issues.
tl;dr: This is default TTY behaviour and unrelated to Docker, per the ticket filed on GitHub about this exact issue.
Quoting the relevant comments in that ticket:
Looks like this is indeed TTY by default translates newlines to CRLF
$ docker run -t --rm debian sh -c "echo -n '\n'" | od -c
0000000 \r \n
0000002
disabling "translate newline to carriage return-newline" with stty -onlcr correctly gives;
$ docker run -t --rm debian sh -c "stty -onlcr && echo -n '\n'" | od -c
0000000 \n
0000001
Default TTY options seem to be set by the kernel ... On my linux host it contains:
/*
* Defaults on "first" open.
*/
#define TTYDEF_IFLAG (BRKINT | ISTRIP | ICRNL | IMAXBEL | IXON | IXANY)
#define TTYDEF_OFLAG (OPOST | ONLCR | XTABS)
#define TTYDEF_LFLAG (ECHO | ICANON | ISIG | IEXTEN | ECHOE|ECHOKE|ECHOCTL)
#define TTYDEF_CFLAG (CREAD | CS7 | PARENB | HUPCL)
#define TTYDEF_SPEED (B9600)
ONLCR is indeed there.
When we go looking at the ONLCR flag documentation, we can see that:
[-]onlcr: translate newline to carriage return-newline
To again quote the github ticket:
Moral of the story, don't use -t unless you want a TTY.
TTY line endings are CRLF, this is not Docker's doing.
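If a script cannot drop -t for some reason, one defensive option is to strip carriage returns from the captured output. The sketch below does not start a container; it simulates the CRLF-terminated output with printf, and strip_cr is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: remove every carriage return from stdin.
strip_cr() {
    tr -d '\r'
}

# Simulated stand-in for the output of `docker run -t ... date`; no Docker is involved here.
raw_len=$(printf 'Thu Mar  7 00:16:19 UTC 2019\r\n' | wc -c)
clean_len=$(printf 'Thu Mar  7 00:16:19 UTC 2019\r\n' | strip_cr | wc -c)
echo "raw: $raw_len bytes, cleaned: $clean_len bytes"
```

In a real script this would look something like docker run ... | strip_cr, though omitting -t remains the simpler fix.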

Why does grep -F -f stop working correctly once there is an empty line?

There are two files, a and b, and I want to find their common lines and their differing lines.
➜ ~ cat a <(echo) b
1
2
3
4
5
1
2
a
4
5
#find common lines
➜ ~ grep -F -f a b
1
2
4
5
#find b-a
➜ ~ grep -F -v -f a b
a
Everything is OK, but when one file contains an empty line, grep stops working as expected; see below:
# add an empty line in file a
➜ ~ cat a
1
2
3
4
5
# a is not a common line, yet it is printed
➜ ~ grep -F -f a b
1
2
a
4
5
# b-a prints nothing
➜ ~ grep -F -v -f a b
Why is that? Why does grep stop working correctly once there is an empty line?
In addition, using grep to find common elements has another problem, e.g.
➜ ~ cat a <(echo) b
1
2
3
4
5
6
1
2
a
4
5
6_id
➜ ~ grep -F -f a b
1
2
4
5
6_id
Can you use comm and diff instead of grep? Note that comm expects both files to be sorted.
To find the common lines use:
comm -12 a b
To find the differing lines:
diff a b
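As for the why: as far as I can tell, an empty line in the pattern file is an empty pattern, and an empty pattern matches every line, so grep -F -f a b prints all of b and the -v form prints nothing. The 6_id case is separate: -F matches substrings unless -x (whole-line match) is added. A sketch under those assumptions, with made-up scratch file names:

```shell
#!/bin/sh
work=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 '' > "$work/a"    # file a, ending with an empty line
printf '%s\n' 1 2 a 4 5 6_id > "$work/b"

# Drop empty lines from the pattern list so no empty pattern reaches grep.
grep -v '^$' "$work/a" > "$work/a.clean"

# -x makes each fixed string match only a whole line, so 6 no longer matches 6_id.
common=$(grep -F -x -f "$work/a.clean" "$work/b")
only_b=$(grep -F -x -v -f "$work/a.clean" "$work/b")

# comm needs sorted input, so sort both files first.
sort "$work/a" > "$work/a.sorted"
sort "$work/b" > "$work/b.sorted"
common_comm=$(comm -12 "$work/a.sorted" "$work/b.sorted")

printf 'common:\n%s\nonly in b:\n%s\n' "$common" "$only_b"
rm -rf "$work"
```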

Resources