Filter lines using grep -v - grep

I am trying to filter out lines that don't contain file names, using the command below, but I am not sure why the line with "Permission denied" keeps showing up in my result. It should be gone, since I used egrep -v "total|denied".
wc -l *.* | egrep -v "total|denied" | sort -nr -k1,1
wc: host.save: Permission denied
33301 apache-maven-3.5.3-bin.tar.gz
14149 jenkins-cli.jar
240 examples.desktop
19 list.py
19 interview_GL.sh
17 lines.txt
7 number.py

Only stdout is passed through the pipe into grep, but those error messages are written to stderr.
You can either send stderr to /dev/null or redirect it to stdout as well.
Send errors to /dev/null:
wc -l * 2>/dev/null
Redirect errors to stdout:
wc -l * 2>&1 | grep -v dir
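To see the two streams behave differently, here is a minimal sketch. The emit function is a hypothetical stand-in for wc hitting an unreadable file (it is not part of the original commands), and the variable names are made up for illustration.

```shell
# Hypothetical stand-in for `wc -l` hitting an unreadable file:
# one normal line on stdout, one error line on stderr.
emit() {
  echo "17 lines.txt"
  echo "wc: host.save: Permission denied" >&2
}

# stderr bypasses the pipe: grep filters only stdout, so the error line
# still comes out (captured here by redirecting the group's stderr).
leaked=$( { emit | grep -v "denied" >/dev/null; } 2>&1 )

# With 2>&1 placed before the pipe, the error line passes through grep,
# so grep -v can drop it.
filtered=$(emit 2>&1 | grep -v "denied")

printf 'leaked:   %s\n' "$leaked"
printf 'filtered: %s\n' "$filtered"
```

The first capture shows that the "Permission denied" line never reached grep; the second shows it being filtered once stderr is merged into stdout.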

You evidently aren't allowed to read the contents of host.save, which is why the first command (wc) reports the error.
Have you tried silencing the errors instead?
wc -l *.* 2>/dev/null | egrep -v "total|denied" | sort -nr -k1,1

Related

Grep redirection is pulling more information than I want in log.txt

I want the output of the sed file edit to go into my log file named d_selinuxlog.txt. Currently, grep outputs the specified string as well as 3 other strings from above and below it in the edited file.
#!/bin/bash
{ getenforce;
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config;
grep "SELINUX=*" /etc/selinux/config > /home/neb/scropts/logs/d_selinuxlog.txt;
setenforce 0;
getenforce; }
I want to see just SELINUX=disabled in the log file.
Every line containing SELINUX matches, including the commented ones: as a regular expression, SELINUX=* means "SELINUX" followed by zero or more = characters. So you need to drop the * from the pattern and filter out the comment lines.
grep "SELINUX=" /etc/selinux/config | grep -v "#"
This is my output
17:52:07 alvaro@lykan /home/alvaro
$ grep "SELINUX=" /etc/selinux/config | grep -v "#"
SELINUX=disabled
17:52:22 alvaro@lykan /home/alvaro
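
An alternative to the second grep -v is to anchor the pattern at the start of the line. The sketch below is only an illustration: the file contents are a hypothetical sample resembling /etc/selinux/config, not the asker's actual file.

```shell
# Hypothetical sample resembling /etc/selinux/config.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
SELINUX=disabled
# SELINUXTYPE= can take one of these values:
SELINUXTYPE=targeted
EOF

# "SELINUX=*" matches "SELINUX" followed by zero or more "=",
# so comment lines and SELINUXTYPE lines are counted too.
loose=$(grep -c "SELINUX=*" "$cfg")

# "^SELINUX=" requires the line to start with exactly "SELINUX=".
strict=$(grep "^SELINUX=" "$cfg")

echo "loose pattern matched $loose lines; anchored pattern printed: $strict"
rm -f "$cfg"
```

Anchoring with ^ avoids dropping a wanted line that merely happens to contain a # further along, which grep -v "#" would do.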

How to grep lines non-repeatedly for same command?

I have a space-separated file that looks like this:
$ cat in_file
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004927566.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004919950.1 FAD_binding_3
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 FAD_binding_3
I am using the following shell script utilizing grep to search for strings:
$ cat search_script.sh
grep "GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1" Pfam_anntn_temp.txt
grep "GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1" Pfam_anntn_temp.txt
The problem is that I want each grep command to return only the first instance of the string it finds exclusive of the previous identical grep command's output.
I need an output which would look like this:
$ cat out_file
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 FAD_binding_3
in which line 1 is exclusively the output of the first grep command and line 2 is exclusively the output of the second grep command. How do I do it?
P.S. I am running this on a big file (>125,000 lines). So, search_script.sh is mostly composed of unique grep commands. It is the identical commands' execution that is messing up my downstream analysis.
I'm assuming you are generating search_script.sh automatically from the contents of in_file. If you can count how many times you'll repeat the same grep command you can just use grep once and use head, for example if you know you'll be using it 2 times:
grep "foo" bar.txt | head -2
Will output the first 2 occurrences of "foo" in bar.txt.
If you have to do the grep commands separately, for example if you have other code in between the grep commands, you can mix head and tail:
grep "foo" bar.txt | head -1 | tail -1
Some other commands...
grep "foo" bar.txt | head -2 | tail -1
head -N displays the first N lines of the input (the same as head -n N)
tail -N displays the last N lines of the input (the same as tail -n N)
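The head/tail combination can be wrapped in a small function. This is a sketch only: nth_match is a hypothetical helper name, and the data file is a temporary copy of the sample lines from the question.

```shell
# Hypothetical helper: print the N-th line of FILE that contains STRING.
# Usage: nth_match N STRING FILE
nth_match() {
  grep -F -- "$2" "$3" | head -n "$1" | tail -n 1
}

# Sample data copied from the question's in_file.
data=$(mktemp)
cat > "$data" <<'EOF'
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004927566.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004919950.1 FAD_binding_3
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 FAD_binding_3
EOF

key="GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1"
first=$(nth_match 1 "$key" "$data")
second=$(nth_match 2 "$key" "$data")
printf '%s\n%s\n' "$first" "$second"
rm -f "$data"
```

Note that if the file has fewer than N matches, tail -n 1 returns the last match that does exist, so a repeated line is still possible in that case.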
If you really MUST always use the same command, but ensure that the outputs always differ, the only way I can think of to achieve this is using temporary files and a complex sequence of commands:
cat foo.bar.txt.tmp 2>&1 | xargs -I xx echo "| grep -v \\'xx\\' " | tr '\n' ' ' | xargs -I xx sh -c "grep 'foo' bar.txt xx | head -1 | tee -a foo.bar.txt.tmp"
So to explain this command, given foo as a search string and bar.txt as the filename, then foo.bar.txt.tmp is a unique name for a temporary file. The temporary file will hold the strings that have already been output:
cat foo.bar.txt.tmp 2>&1: outputs the contents of the temporary file. If the file does not exist, the error message goes to stdout instead (this matters because the rest of the command wouldn't work if the output were empty).
xargs -I xx echo "| grep -v \\'xx\\' " adds | grep -v to the start of each line in the temporary file, grep -v something excludes lines that include something.
tr '\n' ' ' replaces newlines with spaces, to have on a single string a sequence of grep -vs.
xargs -I xx sh -c "grep 'foo' bar.txt xx | head -1 | tee -a foo.bar.txt.tmp" runs a new command, grep 'foo' bar.txt xx | head -1 | tee -a foo.bar.txt.tmp, replacing xx with the previous output. xx should be the sequence of grep -vs that exclude previous outputs.
head -1 makes sure only one line is output at a time
tee -a foo.bar.txt.tmp appends the new output to the temporary file.
Just be sure to clear the temporary files, rm *.tmp, at the end of your script.
If I understand the question correctly and you want to remove duplicates based on the last field of each line, then try the following (this is an easy task for awk).
awk '!a[$NF]++' Input_file
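To show what this one-liner keeps, here is a small run against the sample lines from the question (copied into a temporary file; the variable names are made up for illustration).

```shell
# Sample data copied from the question's in_file.
data=$(mktemp)
cat > "$data" <<'EOF'
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004927566.1 Chal_sti_synt_C
GCF_000046845.1_ASM4684v1_protein.faa WP_004919950.1 FAD_binding_3
GCF_000046845.1_ASM4684v1_protein.faa WP_004920342.1 FAD_binding_3
EOF

# a[$NF]++ evaluates to 0 the first time a given last field is seen,
# so !a[$NF]++ is true only for that first occurrence and awk prints the line.
deduped=$(awk '!a[$NF]++' "$data")
printf '%s\n' "$deduped"
rm -f "$data"
```

On this sample it keeps the first line for each distinct last field, so the FAD_binding_3 line it prints is the WP_004919950.1 one, which is not the same as the out_file shown in the question.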

How come I don't get all the lines with grep and grep -v

Why does grep -v POLYGON remove many more lines than those matching grep POLYGON?
$ cat BOUNDARIES3D_LV03.nt | grep -v POLYGON | wc
249 782 137001
$ cat BOUNDARIES3D_LV03.nt | grep POLYGON | wc
2441 2753697 51833677
$ cat BOUNDARIES3D_LV03.nt | wc
73078 2975809 91746795
Is this a bug in grep (using: grep (GNU grep) 2.23) or am I misunderstanding something?
Update
It seems that grep aborts at the first matching line containing an invalid character.
The problem was that grep aborts at the first line containing a byte sequence that doesn't evaluate to a character in the current encoding. The following resolved the issue for me:
export LC_ALL="en_US.UTF-8"
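Another option, assuming GNU grep, is the -a (--text) flag, which forces the input to be treated as text regardless of invalid bytes. The sketch below uses a made-up three-line file containing one byte that is not valid UTF-8, and checks that the matching and non-matching counts add up to the total.

```shell
f=$(mktemp)
# \377 is a byte that is not valid UTF-8 (hypothetical sample data).
printf 'POLYGON one\nLINE two \377\nPOLYGON three\n' > "$f"

# -a treats the file as text; -c counts lines; -v inverts the match.
matching=$(grep -a -c POLYGON "$f")
non_matching=$(grep -a -v -c POLYGON "$f")
total=$(wc -l < "$f")

echo "$matching + $non_matching = $((matching + non_matching)) of $((total)) lines"
rm -f "$f"
```

Comparing the two counts against wc -l is a quick way to check whether grep has silently stopped partway through a file.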

trying to grep '--string' fails

I'm trying to grep for a string that starts with "--"
For some reason it is treated as a special character, and even when I use -F, grep gives me a syntax error:
[root@pc-01 /]# grep -F --restore .
-bash: --restore: command not found
any tips?
Thanks.
Try the following. The -- marks the end of options, so --restore is taken as the pattern:
grep -F -- --restore filename
You can also escape the first -:
Without escaping:
[root@TIAGO-TEST2 tmp]# echo '--aa --bb --cc' | grep -o '--b'
grep: option '--b' is ambiguous; possibilities: '--basic-regexp' '--binary' '--byte-offset' '--binary-files' '--before-context'
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
Escaping:
[root@TIAGO-TEST2 tmp]# echo '--aa --bb --cc' | grep -o '\--b'
--b
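
Besides -- and escaping, the -e option also tells grep that the next argument is a pattern. A short sketch comparing the two, using made-up sample text:

```shell
# Hypothetical two-line input.
sample='run --restore now
run --backup now'

# "--" ends option parsing, so --restore is read as the pattern.
with_dashdash=$(printf '%s\n' "$sample" | grep -F -- --restore)

# "-e PATTERN" names the pattern explicitly, even if it starts with "-".
with_e=$(printf '%s\n' "$sample" | grep -F -e --restore)

printf '%s\n%s\n' "$with_dashdash" "$with_e"
```

Both forms print the same line; -e is handy when several patterns are needed, since it can be repeated.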

view content of files from grep -L

I use grep -L to get a list of files that do not contain a certain string. How can I see the content of those files? Just like:
grep -L "pattern" | cat
You can use xargs:
grep -L "pattern" | xargs cat
According to man xargs, it will "build and execute command lines from standard input", so it runs cat on the file names that grep -L returns.
You can also pass the output of grep -L to cat via command substitution:
cat $(grep -L "pattern" *.files )
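If file names may contain spaces, a NUL-delimited variant is safer. This is a sketch assuming GNU grep (for -Z) and an xargs that supports -0; the directory and file names are made up for the example.

```shell
# Hypothetical test files in a temporary directory.
dir=$(mktemp -d)
printf 'has pattern\n' > "$dir/a.txt"
printf 'no match here\n' > "$dir/b c.txt"   # name contains a space

# -L lists files without a match; -Z ends each name with a NUL byte,
# and xargs -0 splits on NUL, so the space in "b c.txt" is preserved.
contents=$(grep -L -Z -- "pattern" "$dir"/*.txt | xargs -0 cat)
printf '%s\n' "$contents"
rm -rf "$dir"
```

With plain xargs or an unquoted $(...), a name like "b c.txt" would be split into two words and cat would fail to find either.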

Resources