Grep -v -f filename not working on one server

I've been running a script for years to extract Exchange mail recipient addresses and update my local Postfix relay_recipients table. The script comes from The Book of Postfix and has worked flawlessly until now.
The key line that is causing issues still works fine on one server, but gives me an empty result on the other.
cat /home/username/mailrecipients.txt extra_recipients | tr -d \" | tr , \\n | tr \; \\n | tr -d '\r' | awk -F\: '/(SMTP|smtp):/ {printf("%s\tOK\n",$2)}' | grep -vi --file=blacklist
The script combines some hardcoded addresses with the mail recipients file, strips out unneeded text, and generates a long list of email addresses, each followed by OK and a newline. Everything works perfectly up to the grep. On one server, piping the output to grep with an inverse match against the patterns in the blacklist file strips out the addresses I don't want and returns the rest correctly. On the other server, with the same version of grep, I get an empty result. The three files (blacklist, mailrecipients.txt, and extra_recipients) are identical on both servers.
Any ideas on what's happening here?
EDIT:
Output of the awk command before grep:
email1@mycompany.com OK
email2@mycompany.com OK
email3@mycompany.com OK
restricted@mycompany.com OK
blacklist file:
restricted
Expected result:
email1@mycompany.com OK
email2@mycompany.com OK
email3@mycompany.com OK
One server returns this correctly, and the other server returns an empty set.

OK, I feel bad now, but the reason it isn't working is that the "blacklist" file on one server had an empty line at the bottom that I didn't see. A blank line in the pattern file caused the whole thing to behave differently than expected.
Once I edited the blacklist file and removed the empty line, everything works correctly again.
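For anyone hitting the same thing: each line of a grep --file pattern file is a separate pattern, an empty line is an empty pattern, and an empty pattern matches every line, so with -v the output becomes empty. A minimal sketch (file and address names are just illustrative):
printf 'restricted\n\n' > blacklist.demo                                # note the trailing blank line
printf 'email1@example.com\tOK\n' | grep -vi --file=blacklist.demo     # prints nothing
grep -v '^$' blacklist.demo > blacklist.clean                          # strip blank lines from the pattern file
printf 'email1@example.com\tOK\n' | grep -vi --file=blacklist.clean    # prints the line again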

Related

grep not finding blank lines on a particular file

I perform the command grep '^$' myfile and receive no results.
In vi you can clearly see a blank line at line 2 of the file. With :set number and :set list enabled, this is what that line looks like:
2 $
The previous line terminates with a $ too. If I run it without the ^, it returns every single line, as you would expect.
I run the command on other files and it works, but not on files from one particular source. The file is ASCII text, with CRLF line terminators, as are the others.
Not sure what else I can look at on this type of file that would affect these results.
*Looking at the hidden characters in Notepad++, the problematic file has CRLF at the end of its lines, blank or otherwise.
The non-problematic one is just LF.
Somewhere in there is the problem; I'm just finding it difficult to craft a grep statement that accounts for it.
*I took the problematic file, ran it through dos2unix, and now grep -En '^$' myfile works. Too bad I can't be editing this file for my ultimate fix.
*In the end, this is what worked for this file type:
grep --color=never -n '^[^[:print:]]' myfile
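A couple of other ways to match those CRLF "blank" lines directly, as a sketch: the blank lines still contain a carriage return, which is why '^$' never matches them.
grep -n $'^\r$' myfile             # bash/zsh: $'...' embeds a literal CR in the pattern
grep -n '^[[:space:]]*$' myfile    # matches lines containing only whitespace, CR included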

Find multiple files containing a certain string and list the files that don't contain the string

In the midst of building a site checker I have run into a problem: the client needs to check all of their pages for certain strings and list the files that do not include the code yet.
I've tried multiple grep commands with no success. The -v option supposedly outputs the inverted match, but that does not happen. I am also missing the part of the command that tells grep to only search specific files (for example, files named code.php) in all subfolders.
With the current code it searches all files, even unnecessary ones.
grep -vrn '.' -e "STRING" > list.txt
I'd like to export a list of files (preferably checking only files with the same name across all subfolders) that do not contain the string I am looking for.
grep -c STRING files will give you a count of lines with STRING for each file.
You can optimize it somewhat with -m1 to stop after the first match.
You can pipe that to sed to grab the files with zero matches:
grep -cm1 STRING files | sed -n '/:0$/s/:0//p'
That gets you one file per line.
You can pipe that to xargs to merge it into a one-line list.
If your STRING is just that and not a regex, you could use the -F flag with grep to specify it's a fixed string, and that will also speed things up. So maybe...
grep -Fcm1 STRING files | sed -n '/:0$/s/:0//p' > list.txt
...in that case
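If your grep supports them (GNU grep does), -L and --include offer a more direct route that also covers the "only files named code.php in all sub folders" part of the question; a sketch using the same STRING placeholder:
grep -rL --include='code.php' -F 'STRING' . > list.txt    # -L lists files with no matching lines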

How to grep for two words existing on the same line? [duplicate]

This question already has answers here:
Match two strings in one line with grep
(23 answers)
Closed 3 years ago.
How do I grep for lines that contain two given words? I'm looking for lines that contain both words. I tried a pipe like this:
grep -c "word1" | grep -r "word2" logs
It just gets stuck after the first pipe command.
Why?
Why do you pass -c? That will just show the number of matches. Similarly, there is no reason to use -r. I suggest you read man grep.
To grep for 2 words existing on the same line, simply do:
grep "word1" FILE | grep "word2"
grep "word1" FILE will print all lines that have word1 in them from FILE, and then grep "word2" will print the lines that have word2 in them. Hence, if you combine these using a pipe, it will show lines containing both word1 and word2.
If you just want a count of how many lines had the 2 words on the same line, do:
grep "word1" FILE | grep -c "word2"
Also, to address your question of why it gets stuck: in grep -c "word1", you did not specify a file. Therefore, grep expects input from stdin, which is why it seems to hang. You can press Ctrl+D to send an EOF (end-of-file) so that it quits.
Prescription
One simple rewrite of the command in the question is:
grep "word1" logs | grep "word2"
The first grep finds lines with 'word1' from the file 'logs' and then feeds those into the second grep which looks for lines containing 'word2'.
However, it isn't necessary to use two commands like that. You could use extended grep (grep -E or egrep):
grep -E 'word1.*word2|word2.*word1' logs
If you know that 'word1' will precede 'word2' on the line, you don't even need the alternatives and regular grep would do:
grep 'word1.*word2' logs
The 'one command' variants have the advantage that there is only one process running, so the lines containing 'word1' do not have to be passed via a pipe to the second process. How much this matters depends on how big the data file is and how many lines match 'word1'. If the file is small, performance isn't likely to be an issue and running two commands is fine. If the file is big but only a few lines contain 'word1', there isn't going to be much data passed on the pipe and using two commands is fine. However, if the file is huge and 'word1' occurs frequently, then you may be passing significant data down the pipe, where a single command avoids that overhead.
Against that, the regex is more complex; you might need to benchmark it to find out what's best, but only if performance really matters. If you run two commands, you should aim to select the less frequently occurring word for the first grep, to minimize the amount of data processed by the second.
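For instance (a hypothetical log file; the words are only illustrative), if 'failure' is rare and 'request' is common, filtering on the rare word first keeps the pipe traffic small:
grep 'failure' big.log | grep 'request'    # the second grep only sees the few 'failure' lines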
Diagnosis
The initial script is:
grep -c "word1" | grep -r "word2" logs
This is an odd command sequence. The first grep is going to count the number of occurrences of 'word1' on its standard input, and print that number on its standard output. Until you indicate EOF (e.g. by typing Control-D), it will sit there, waiting for you to type something. The second grep does a recursive search for 'word2' in the files underneath directory logs (or, if it is a file, in the file logs). Or, in my case, it will fail since there's neither a file nor a directory called logs where I'm running the pipeline. Note that the second grep doesn't read its standard input at all, so the pipe is superfluous.
With Bash, the parent shell waits until all the processes in the pipeline have exited, so it sits around waiting for the grep -c to finish, which it won't do until you indicate EOF. Hence, your code seems to get stuck. With Heirloom Shell, the second grep completes and exits, and the shell prompts again. Now you have two processes running, the first grep and the shell, and they are both trying to read from the keyboard, and it is not determinate which one gets any given line of input (or any given EOF indication).
Note that even if you typed data as input to the first grep, you would only get any lines that contain 'word2' shown on the output.
Footnote:
At one time, the answer used:
grep -E 'word1.*word2|word2.*word1' "$@"
grep 'word1.*word2' "$@"
This triggered the comments below.
You could use awk, like this:
cat <yourFile> | awk '/word1/ && /word2/'
Order is not important. So if you have a file named file1 containing:
word1 is in this file as well as word2
word2 is in this file as well as word1
word4 is in this file as well as word1
word5 is in this file as well as word2
then,
/tmp$ cat file1| awk '/word1/ && /word2/'
will result in,
word1 is in this file as well as word2
word2 is in this file as well as word1
yes, awk is slower.
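The same idea extends to any number of words, and awk can read the file directly, so the cat isn't needed; a small sketch against the same file1:
awk '/word1/ && /word2/' file1               # same result, without the cat
awk '/word1/ && /word2/ && /word3/' file1    # and it extends to any number of words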
The main issue is that you haven't supplied the first grep with any input. You will need to reorder your command to something like:
grep "word1" logs | grep "word2"
If you want to count the occurrences, then put a '-c' on the second grep.
git grep
Here is the syntax using git grep combining multiple patterns using Boolean expressions:
git grep -e pattern1 --and -e pattern2 --and -e pattern3
The above command will print lines matching all the patterns at once.
If the files aren't under version control, add the --no-index param to search files in the current directory that are not managed by Git.
Check man git-grep for help.
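For example, a sketch against an unversioned directory, using the question's word1 and word2 as placeholders:
git grep --no-index -e word1 --and -e word2    # lines containing both patterns, outside a Git repository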
See also:
How to use grep to match string1 AND string2?
Check if all of multiple strings or regexes exist in a file.
How to run grep with multiple AND patterns?
For multiple patterns stored in the file, see: Match all patterns from file at once.
You can try the command below:
cat log|grep -e word1 -e word2
Use grep:
grep -wE "string1|String2|...." file_name
Or you can use:
echo string | grep -wE "string1|String2|...."

grep output different on two servers

I am trying to create a script, and one part requires showing lines with numeric values.
My basic syntax is:
echo $i | grep [0-9]
For example, if I set i=12345, it should output 12345.
But on one server, it doesn't output anything (exactly the same commands).
I do not know how to Google this issue; I have tried "grep output different on other server", to no avail.
When using a regexp, either use egrep or grep -e to make sure the pattern is not treated as a plain string.
Maybe it's a shell issue? Some shells interpret [] differently.
try
echo "1234" | grep "[0-9]"
(with quotes)
also try
grep --version
to see if there is a different grep version
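One concrete way the unquoted pattern can go wrong (a hypothetical reproduction; the file names are made up): if the working directory contains files whose names match the glob [0-9], the shell expands the pattern before grep ever sees it.
cd "$(mktemp -d)" && touch 7 8
i=12345
echo $i | grep [0-9]      # the shell expands [0-9] to "7 8": grep searches the empty file 8 for "7" and ignores stdin
echo $i | grep "[0-9]"    # quoted, the pattern reaches grep intact and 12345 is printed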

How to make grep stop at first match on a line?

Well, I have a file test.txt
#test.txt
odsdsdoddf112 test1_for_grep
dad23392eeedJ test2 for grep
Hello World test
garbage
I want to extract strings which have a space after them. I used the following expression and it worked:
grep -o [[:alnum:]]*.[[:blank:]] test.txt
Its output is
odsdsdoddf112
dad23392eeedJ
test2
for
Hello
World
But the problem is that grep prints all the strings that have a space after them, whereas I want it to stop after the first match on a line and then proceed to the next line.
Which expression should I use here to make it stop after the first match and move to the next line?
This problem may be solved with gawk or some other tool, but I would appreciate a solution which uses grep only.
Edit
I am using GNU grep 2.5.1 on a Linux system, if that is relevant.
Edit
With the help of the answers given below, I tried my luck with
grep -o ^[[:alnum:]]* test.txt
grep -Eo ^[[:alnum:]]+ test.txt
and both gave me correct answers.
Now what surprises me is that I tried using
grep -Eo "^[[:alnum:]]+[[:blank:]]" test.txt
as suggested here but didn't get the correct answer.
Here is the output on my terminal
odsdsdoddf112
dad23392eeedJ
test2
for
Hello
World
But comments from RichieHindle and Adrian Pronk show that they got the correct output on their systems. Does anyone have an idea why I am not getting the same result on mine? Any help will be appreciated.
Edit
Well, it seems that grep 2.5.1 has a bug that caused my incorrect output. I installed grep 2.5.4 and now it is working correctly. Please see this link for details.
If you're sure you have no leading whitespace, add a ^ to match only at the start of a line, and change the * to a + to match only when you have one or more alphanumeric characters. (That means adding -E to use extended regular expressions).
grep -Eo "^[[:alnum:]]+[[:blank:]]" test.txt
(I also removed the . from the middle; I'm not sure what that was doing there?)
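On a grep without the bug described below, that command prints only the first token of each line of the sample test.txt, each followed by the trailing blank that -o keeps, roughly:
odsdsdoddf112 
dad23392eeedJ 
Hello 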
As the questioner discovered, this is a bug in versions of GNU grep prior to 2.5.3. The bug allows a caret to match after the end of a previous match, not just at beginning of line.
This bug is still present in other versions of grep, for instance in Mac OS X 10.9.4.
There isn't a universal workaround, but in some cases, like non-spaces followed by a space, you can often get the desired behavior by leaving off the delimiter. That is, search for '[^ ]*' rather than '[^ ]* '.
grep -oe "^[^ ]* " test.txt
If we want to extract all meaningful input before the garbage and actually stop at the first match, then the -B NUM (--before-context=NUM) option may be useful: it will "print NUM lines of leading context before matching lines".
Example:
grep --before-context=999999 "Hello World test" test.txt
