Automatically ignore files in grep - grep

Is there any way I could use grep to ignore some files when searching something, something equivalent to svnignore or gitignore? I usually use something like this when searching source code.
grep -r something * | grep -v ignore_file1 | grep -v ignore_file2
Even if I could set up an alias to grep to ignore these files would be good.

The --exclude option of grep will also work:
grep perl * --exclude='try*' --exclude='tk*'
This searches for perl in files in the current directory, excluding files beginning with try or tk. Quoting the globs keeps the shell from expanding them before grep sees them.
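For the recursive case from the question, a minimal sketch of the same idea might look like this. It assumes GNU grep; the scratch directory and the file names (main.txt, try_one.txt, tk_gui.txt) are made up for illustration.

```shell
# Scratch directory with hypothetical file names, for illustration only.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'use perl\n' > main.txt
printf 'use perl\n' > try_one.txt
printf 'use perl\n' > tk_gui.txt

# Quote the globs so that grep, not the shell, interprets them.
# -r searches recursively from ".", -l prints only the matching file names.
matches=$(grep -rl --exclude='try*' --exclude='tk*' perl .)
echo "$matches"
```

Only main.txt should be listed, since the other two names match the excluded globs.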

You might also want to take a look at ack which, among many other features, by default does not search VCS directories like .svn and .git.

find . -path ./ignore -prune -o -type f -exec grep something {} +
This finds all regular files under your current directory, skipping the directory (or file) named "ignore", and runs grep something on the files that remain. Restricting the match to -type f and dropping -r matters: otherwise grep -r would also be run on . itself and would descend into the ignored directory anyway.
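To see the pruning in action, here is a small sketch that restricts the search to regular files. The directory layout (src, ignore) and file names are hypothetical.

```shell
# Build a throwaway tree: one directory to search, one to skip.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/ignore"
printf 'something here\n' > "$demo/src/a.txt"
printf 'something here\n' > "$demo/ignore/b.txt"
cd "$demo" || exit 1

# Prune ./ignore, then grep only the regular files that are left.
# "{} +" passes many files to one grep invocation.
found=$(find . -path ./ignore -prune -o -type f -exec grep -l something {} +)
echo "$found"
```

The file under ./ignore contains the word too, but it should not show up in the output.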

Use shell expansion (bash extended globbing):
shopt -s extglob
for file in !(file1_ignore|file2_ignore)
do
grep ..... "$file"
done

I think grep does not have filename filtering.
To accomplish what you are trying to do, you can combine the find, xargs, and grep commands.
My memory is not good, so the example might not work:
find . -name "foo" | xargs grep "pattern"
find is flexible: you can use wildcards, ignore case, or use regular expressions.
You may want to read the manual pages for a full description.
Edit: after reading the next post, apparently grep does have filename filtering.

Here's a minimalistic version of .gitignore. Requires standard utils: awk, sed (because my awk is so lame), egrep:
cat > ~/bin/grepignore #or anywhere you like in your $PATH
egrep -v "`awk '1' ORS=\| .grepignore | sed -e 's/|$//g' ; echo`"
^D
chmod 755 ~/bin/grepignore
cat >> ./.grepignore #above set to look in cwd
ignorefile_1
...
^D
grep -r something * | grepignore
grepignore builds a simple alternation clause:
egrep -v 'ignorefile_one|ignorefile_two'
Not incredibly efficient, but good for manual use.
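A possible simplification, not the script above but a swapped-in technique: grep can read its patterns straight from a file with -f, so the awk/sed step may not be needed. This sketch assumes one fixed string per line in .grepignore and no blank lines (a blank line would match everything); the file names are hypothetical.

```shell
# Throwaway directory with one file to keep and one to ignore.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'something\n' > keep.txt
printf 'something\n' > ignorefile_1
printf 'ignorefile_1\n' > .grepignore

# -f reads patterns from .grepignore, -F treats them as fixed strings,
# -v drops every result line that mentions an ignored name.
filtered=$(grep -r something . | grep -v -F -f .grepignore)
echo "$filtered"
```

Like the original, this filters on the whole output line, so a match whose text happens to contain an ignored name is dropped as well.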

Related

Combine grep -v with grep -r?

I want to remove an entire line of text from all files in a given directory. I know I can use grep -v foo filename to do this one file at a time. And I know I can use grep -r foo to search recursively through a directory. How do I combine these commands to remove a given line of text from all files in a directory?
The UNIX command to find files is named find, not grep. Forget you ever heard of grep -r as it's just a bad idea, here's the right way to find files and perform some action on them:
find . -type f -print0 | xargs -0 sed -i '/badline/d'
Try something like:
grep -vlre 'foo' . | xargs sed -i 's/pattern/replacement/g'
Broken down:
grep:
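-v 'Inverse match' (selects non-matching lines; combined with -l it lists files that have at least one such line)
-l 'Show filename'
-r 'Search recursively'
-e 'Use the next argument as the pattern' (extended regex is -E, not -e)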
xargs: For each entry perform
sed -i: replace inline
I think this would work:
grep -ilre 'Foo' . | xargs sed -i 'extension' '/Foo/d'
Where 'extension' refers to the addition to the file name. It will make a copy of the original file with the extension you designated and the modified file will have the original filename. I added -i in case you require it to be case insensitive.
modified file1 becomes "file1"
original file1 becomes "file1extension"
See also the question "invalid command code ., despite escaping periods, using sed".
One of the responses there points out that the -i option of sed on OS X is slightly different: it requires an extension argument. Without one, sed takes the script as the extension and then interprets the file name as a command, which is why you are seeing that error.
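One way to sidestep the difference is to attach the backup suffix directly to -i, a form that both GNU sed and the BSD sed on OS X accept. A small sketch, with a hypothetical file1 and the suffix .bak chosen for illustration:

```shell
# Throwaway file containing one line we want to delete.
demo=$(mktemp -d)
printf 'keep\nFoo line\nkeep too\n' > "$demo/file1"

# Suffix glued to -i (no space): edits file1 in place, keeps file1.bak.
sed -i.bak '/Foo/d' "$demo/file1"

edited=$(cat "$demo/file1")
backup=$(cat "$demo/file1.bak")
echo "$edited"
```

Afterwards file1 holds the edited text and file1.bak holds the original, matching the naming described above.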

grep with --include and --exclude

I want to search for a string foo within the app directory, but excluding any file which contains migrations in the file name. I expected this grep command to work
grep -Ir --include "*.py" --exclude "*migrations*" foo app/
The above command seems to ignore the --exclude filter. As an alternative, I can do
grep -Ir --include "*.py" foo app/ | grep -v migrations
This works, but this loses highlighting of foo in the results. I can also bring find into the mix and keep my highlighting.
find app/ -name "*.py" -print0 | xargs -0 grep --exclude "*migrations*" foo
I'm just wondering if I'm missing something about the combination of command line parameters to grep or if they simply don't work together.
I was looking for a term on a .py file, but didn't want migration files to be scanned, so what I found (for grep 2.10) was the following (I hope this helps):
grep -nR --include="*.py" --exclude-dir=migrations whatever_you_are_looking_for .
man grep says:
--include=GLOB
Search only files whose base name matches GLOB (using wildcard matching as described under --exclude).
Because it says "only" there, I'm guessing that your --include statement is overriding your --exclude statement.
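For what it's worth, here is a small sketch of the --exclude-dir variant on a made-up tree. It assumes GNU grep; the directory and file names (app/views/main.py and so on) are hypothetical.

```shell
# Hypothetical project layout with a migrations directory to skip.
demo=$(mktemp -d)
mkdir -p "$demo/app/migrations" "$demo/app/views"
printf 'foo = 1\n' > "$demo/app/views/main.py"
printf 'foo = 2\n' > "$demo/app/migrations/0001_initial.py"
printf 'foo = 3\n' > "$demo/app/notes.txt"

# --include limits the search to *.py; --exclude-dir skips the whole directory.
hits=$(cd "$demo" && grep -rl --include='*.py' --exclude-dir=migrations foo app)
echo "$hits"
```

Note that --exclude-dir matches directory names, so it only helps when "migrations" is a directory rather than part of a file name.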

How can I have grep not print out 'No such file or directory' errors?

I'm grepping through a large pile of code managed by git, and whenever I do a grep, I see piles and piles of messages of the form:
> grep pattern * -R -n
whatever/.git/svn: No such file or directory
Is there any way I can make those lines go away?
You can use the -s or --no-messages flag to suppress errors.
-s, --no-messages suppress error messages
grep pattern * -s -R -n
If you are grepping through a git repository, I'd recommend you use git grep. You don't need to pass in -R or the path.
git grep pattern
That will show all matches from your current directory down.
Errors like that are usually sent to the "standard error" stream, which you can pipe to a file or just make disappear on most commands:
grep pattern * -R -n 2>/dev/null
I have seen that happening several times, with broken links (symlinks that point to files that do not exist), grep tries to search on the target file, which does not exist (hence the correct and accurate error message).
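The broken-symlink case is easy to reproduce. A minimal sketch, with made-up names (real.txt, dangling), comparing grep with and without -s:

```shell
# One real file plus a symlink whose target does not exist.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'pattern here\n' > real.txt
ln -s missing-target dangling

# Capture stdout and stderr together to compare the two runs.
noisy=$(grep -l pattern * 2>&1)   # includes the error about "dangling"
quiet=$(grep -ls pattern * 2>&1)  # -s suppresses that error
echo "$quiet"
```

With -s the only output should be the name of the real file.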
I normally don't bother while doing sysadmin tasks over the console, but from within scripts I do look for text files with "find", and then grep each one:
find /etc -type f -exec grep -nHi -e "widehat" {} \;
Instead of:
grep -nRHi -e "widehat" /etc
I usually don't let grep do the recursion itself. There are usually a few directories you want to skip (.git, .svn...)
You can do clever aliases with stances like that one:
find . \( -name .svn -o -name .git \) -prune -o -type f -exec grep -Hn pattern {} \;
It may seem overkill at first glance, but when you need to filter out some patterns it is quite handy.
Have you tried the -0 option in xargs? It expects NUL-separated input, so pair it with find -print0 rather than ls. Something like this:
find . -type f -print0 | xargs -0 grep 'some text'
Use -I in grep to skip binary files, together with -s to suppress the error messages.
Example: grep SEARCH_ME -Irs ~/logs
I redirect stderr to stdout and then use grep's invert-match (-v) to exclude the warning/error string that I want to hide:
grep -r <pattern> * 2>&1 | grep -v "No such file or directory"
I was getting lots of these errors running "M-x rgrep" from Emacs on Windows with /Git/usr/bin in my PATH. Apparently in that case, M-x rgrep uses "NUL" (the Windows null device) rather than "/dev/null". I fixed the issue by adding this to .emacs:
;; Prevent issues with the Windows null device (NUL)
;; when using cygwin find with rgrep.
(defadvice grep-compute-defaults (around grep-compute-defaults-advice-null-device)
  "Use cygwin's /dev/null as the null-device."
  (let ((null-device "/dev/null"))
    ad-do-it))
(ad-activate 'grep-compute-defaults)
One easy way to make grep return zero status all the time is to use || true
→ echo "Hello" | grep "This won't be found" || true
→ echo $?
0
As you can see, the exit status here is 0 (success).
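One caveat: || true also hides real failures. grep exits with 1 when nothing matched and with 2 (or higher) on an error, so a script that cares about the difference could capture the status instead. A rough sketch, with rc as an illustrative variable name:

```shell
# Remember grep's exit status without letting it abort the script.
rc=0
echo "Hello" | grep -q "This won't be found" || rc=$?

# 1 means "no match"; anything above 1 means grep itself failed.
if [ "$rc" -gt 1 ]; then
    echo "grep failed with status $rc" >&2
fi
echo "$rc"
```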

use grep to return a list of files, given multiple keywords (like google returns a list of webpages)

I need to find ALL files that have multiple keywords anywhere in the file (not necessarily on the same line), given a starting directory like ~/. Does "grep -ro" do this?
(I'm using Unix, Mac OSX 10.4)
You can use the -l option to get a list of filenames with matches, so it's just a matter of finding all of the files that have the first keyword and then filtering that list down to the files that also have the second keyword:
grep -rl first_keyword basedir | xargs grep -l second_keyword
To search just *.txt
find ~/. -name "*.txt" | xargs grep -l first_keyword | xargs grep -l second_keyword
Thanks Adam!
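If file names may contain spaces, the same chain can be made NUL-delimited. A sketch under the assumption that your grep supports --null (GNU and FreeBSD grep do) and your xargs supports -0; the file names are invented for the example.

```shell
# Three hypothetical files; only one contains both keywords.
demo=$(mktemp -d)
printf 'alpha\nbeta\n' > "$demo/both words.txt"
printf 'alpha\n' > "$demo/only_first.txt"
printf 'beta\n' > "$demo/only_second.txt"

# First grep lists files containing "alpha", NUL-terminated;
# the second grep keeps only those that also contain "beta".
both=$(grep -rl --null alpha "$demo" | xargs -0 grep -l beta)
echo "$both"
```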

Using grep and xargs, avoiding 'unterminated quote' error

I'm getting frustrated enough that I figured it was time to ask a question.
I'm trying to replace an email address across a website that is hard coded into 1000's of pages. It's on a FreeBSD 6.3 server.
Here is the command I am using:
grep -R --files-with-matches 'Email\#domain.com' . | sort | uniq | xargs perl -pi -e 's/Email\#domain.com/Email\#newdomain.com/' *.html
And here is the error that I keep getting:
xargs: unterminated quote
Oddly enough, when I run that command on a test case of 3 files (in a nested structure) it works just fine. I've been googling and most solutions seem to deal with adding a -print0 after the . and a -0 after the xargs. However, this yields a different set of errors that lead me to believe I'm putting things in the wrong places.
thanks in advance for your help
Pax is correct. I would further correct it to something like:
grep -R --files-with-matches 'Email\#domain.com' . -print0 | xargs -0 perl -pi -e 's/Email\#domain.com/Email\#newdomain.com/'
EDIT:
Thanks to kcwu, this is the full FreeBSD version:
grep -R --files-with-matches 'Email\#domain.com' . --null | xargs -0 perl -pi -e 's/Email\#domain.com/Email\#newdomain.com/'
Note that I've removed sort and uniq. --files-with-matches is documented to "stop on the first match" so you will not get duplicate files. --null and -0 ensure (and handle) a null-terminated file list, which is vital, because POSIX allows filenames to contain newlines. (-print0 is a find option; grep's equivalent is --null.)
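To illustrate why the NUL-terminated list matters, here is a small sketch with a file name containing a quote and a space, the kind of name that typically makes plain xargs complain about an unterminated quote. It uses sed instead of perl to stay dependency-free; the addresses and the file name are made up.

```shell
# A file whose name contains an apostrophe and a space.
demo=$(mktemp -d)
printf 'old@example.com\n' > "$demo/it's here.html"

# --null and -0 pass the name through untouched, quote and all.
grep -R --files-with-matches --null 'old@example.com' "$demo" |
    xargs -0 sed -i.bak 's/old@example\.com/new@example.com/'

result=$(cat "$demo/it's here.html")
echo "$result"
```

The .bak suffix leaves a backup next to each edited file; delete those once you have checked the result.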
Note that I don't know perl, but I'm assuming that part's roughly equivalent to:
sed -i s/Email\#domain.com/Email\#newdomain.com/g
Why are you giving a list of HTML files to xargs? That program takes its file list from the pipeline (output of grep).
Use GNU Parallel:
grep -R --files-with-matches 'Email\#domain.com' . | sort | uniq | parallel -q perl -pi -e 's/Email\#domain.com/Email\#newdomain.com/g'
Watch the intro video to learn more: http://www.youtube.com/watch?v=OpaiGYxkSuQ

Resources