grep with --include and --exclude

I want to search for a string foo within the app directory, but excluding any file which contains migrations in the file name. I expected this grep command to work
grep -Ir --include "*.py" --exclude "*migrations*" foo app/
The above command seems to ignore the --exclude filter. As an alternative, I can do
grep -Ir --include "*.py" foo app/ | grep -v migrations
This works, but this loses highlighting of foo in the results. I can also bring find into the mix and keep my highlighting.
find app/ -name "*.py" -print0 | xargs -0 grep --exclude "*migrations*" foo
I'm just wondering if I'm missing something about the combination of command line parameters to grep or if they simply don't work together.

I was looking for a term on a .py file, but didn't want migration files to be scanned, so what I found (for grep 2.10) was the following (I hope this helps):
grep -nR --include="*.py" --exclude-dir=migrations whatever_you_are_looking_for .
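A minimal sketch of how these two options combine, assuming GNU grep (this answer mentions 2.10) and a throwaway directory tree whose file names are made up for illustration:

```shell
# Build a small throwaway tree; every path here is hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/app/models" "$demo/app/migrations"
echo 'foo = 1' > "$demo/app/models/user.py"
echo 'foo = 2' > "$demo/app/migrations/0001_initial.py"
echo 'foo = 3' > "$demo/app/notes.txt"

# --include limits the search to *.py files; --exclude-dir skips any
# directory named "migrations" during the recursive walk.
hits=$(cd "$demo" && grep -rl --include='*.py' --exclude-dir=migrations foo app)
echo "$hits"

rm -rf "$demo"
```

Only app/models/user.py should be listed. --exclude-dir is matched against directory names, whereas --exclude is matched against file names, which may be why a pattern like "*migrations*" has no effect on files that merely live inside a migrations directory.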

man grep says:
--include=GLOB
Search only files whose base name matches GLOB (using wildcard matching as described under
--exclude).
Because it says "only" there, I'm guessing that your --include statement is overriding your --exclude statement.

Related

Combine grep -v with grep -r?

I want to remove an entire line of text from all files in a given directory. I know I can use grep -v foo filename to do this one file at a time. And I know I can use grep -r foo to search recursively through a directory. How do I combine these commands to remove a given line of text from all files in a directory?
The UNIX command to find files is named find, not grep. Forget you ever heard of grep -r, as it's just a bad idea. Here's the right way to find files and perform some action on them (-print0 and -0 keep file names containing spaces intact):
find . -type f -print0 | xargs -0 sed -i '/badline/d'
Try something like:
grep -vlre 'foo' . | xargs sed -i 's/pattern/replacement/g'
Broken down:
grep:
-v 'Inverse match'
-l 'Show filename'
-r 'Search recursively'
-e 'Use the next argument as the pattern'
xargs: For each entry perform
sed -i: replace inline
I think this would work:
grep -ilre 'Foo' . | xargs sed -i 'extension' '/Foo/d'
Where 'extension' is the suffix added to the name of the backup file. sed makes a copy of the original file with the extension you designated, and the modified file keeps the original filename. I added -i to grep in case you need the search to be case insensitive.
modified file1 becomes "file1"
original file1 becomes "file1extension"
See "invalid command code ., despite escaping periods, using sed".
One of the responses there suggests that the -i option of the sed shipped with OS X is slightly different, so you need to add an extension. Otherwise the file name is interpreted as a command, which is why you are seeing that error.
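Putting the pieces of this thread together, here is a hedged sketch that deletes every line containing a pattern from all files under a directory. It assumes a grep that supports --null and an xargs that supports -0. The attached form -i.bak is used because, to my knowledge, both GNU and BSD sed accept it. The directory layout and the word badline are placeholders.

```shell
# Throwaway files for the demonstration (names are hypothetical).
work=$(mktemp -d)
mkdir -p "$work/sub dir"
printf 'keep\nbadline\nkeep too\n' > "$work/a.txt"
printf 'badline\nother\n' > "$work/sub dir/b.txt"
printf 'nothing here\n' > "$work/c.txt"

# -l lists only the files that contain the pattern; --null and -0 keep
# file names with spaces intact. sed then deletes the matching lines
# in place, leaving a .bak copy of each file it touched.
grep -rl --null 'badline' "$work" | xargs -0 sed -i.bak '/badline/d'

a_after=$(cat "$work/a.txt")
b_after=$(cat "$work/sub dir/b.txt")
backups=$(find "$work" -name '*.bak' | wc -l)

rm -rf "$work"
```

Files that never contained the pattern (c.txt here) are not rewritten, and each modified file keeps a .bak copy that can be removed once the result looks right.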

Grep for multiple patterns over multiple files

I've been googling around, and I can't find the answer I'm looking for.
Say I have a file, text1.txt, in directory mydir whose contents are:
one
two
and another called text2.txt, also in mydir, whose contents are:
two
three
four
I'm trying to get a list of files (for a given directory) which contain all (not any) patterns I search for. In the example I provided, I'm looking for output somewhere along the lines of:
./text1.txt
or
./text1.txt:one
./text1.txt:two
The only things I've been able to find are concerning matching any patterns in a file, or matching multiple patterns in a single file (which I tried extending to a whole directory, but received grep usage errors).
Any help is much appreciated.
Edit-Things I've tried
grep "pattern1" < ./* | grep "pattern2" ./*
"ambiguous redirect"
grep 'pattern1'|'pattern2' ./*
returns files that match either pattern
One way could be like this:
find . | xargs grep 'pattern1' -sl | xargs grep 'pattern2' -sl
I think this is what you need (you can easily add more patterns). Note that it reports lines matching either pattern, not only files that contain both:
grep -EH 'pattern1|pattern2' mydir/*
To refine brain's answer:
find . -type f -print0 | xargs -0 grep 'pattern1' -slZ | xargs -0 grep 'pattern2' -sl
This will keep grep from trying to search directories, and can properly handle filenames with spaces, if you pass the -Z flag to grep for all but the last pattern and pass -0 to xargs.
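As a quick check, the pipeline can be run against the two sample files from the question. This sketch recreates them in a temporary directory (an assumption made to keep the example self-contained) and uses --null, the long form of -Z in GNU grep:

```shell
# Recreate the question's sample files in a scratch directory.
mydir=$(mktemp -d)
printf 'one\ntwo\n' > "$mydir/text1.txt"
printf 'two\nthree\nfour\n' > "$mydir/text2.txt"

# Stage 1 lists files containing "one" (null-separated);
# stage 2 keeps only those that also contain "two".
both=$(cd "$mydir" && find . -type f -print0 \
  | xargs -0 grep -sl --null 'one' \
  | xargs -0 grep -sl 'two')
echo "$both"

rm -rf "$mydir"
```

Only text1.txt contains both words, so it is the only file that survives the second stage.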

use grep to return a list of files, given multiple keywords (like google returns a list of webpages)

I need to find ALL files that have multiple keywords anywhere in the file (not necessarily on the same line), given a starting directory like ~/. Does "grep -ro" do this?
(I'm using Unix, Mac OSX 10.4)
You can use the -l option to get a list of filenames with matches, so it's just a matter of finding all of the files that have the first keyword and then filtering that list down to the files that also have the second keyword:
grep -rl first_keyword basedir | xargs grep -l second_keyword
To search just *.txt files:
find ~/. -name "*.txt" | xargs grep -l first_keyword | xargs grep -l second_keyword
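The same filtering idea extends to any number of keywords. The helper below is a hypothetical sketch (the name files_with_all is not from the thread): it narrows a null-separated list of *.txt files once per keyword, assuming find -print0, xargs -0 and grep --null are available.

```shell
# files_with_all DIR KEYWORD... : print the *.txt files under DIR that
# contain every KEYWORD (anywhere in the file, not necessarily on one line).
# The body runs in a subshell so its variables do not leak into the caller.
files_with_all() (
  dir=$1
  shift
  list=$(mktemp)
  find "$dir" -type f -name '*.txt' -print0 > "$list"
  for word in "$@"; do
    next=$(mktemp)
    # Keep only the files from the current list that contain this keyword.
    xargs -0 grep -l --null -e "$word" < "$list" > "$next"
    mv "$next" "$list"
  done
  # Convert the null separators to newlines for display.
  tr '\0' '\n' < "$list"
  rm -f "$list"
)
```

It could be called as files_with_all ~ first_keyword second_keyword. Because the output is newline-separated, it is meant for reading rather than for piping into another xargs.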
Thanks Adam!

Automatically ignore files in grep

Is there any way I could use grep to ignore some files when searching something, something equivalent to svnignore or gitignore? I usually use something like this when searching source code.
grep -r something * | grep -v ignore_file1 | grep -v ignore_file2
Even if I could set up an alias to grep to ignore these files would be good.
The --exclude option of grep will also work (quote the globs so the shell leaves them alone):
grep perl * --exclude='try*' --exclude='tk*'
This searches for perl in files in the current directory excluding files beginning with try or tk.
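Since the question also asks about an alias, one option is a small shell function that always passes the same exclusions. This is a sketch assuming GNU grep; the function name grepsrc and the excluded patterns are placeholders to adapt.

```shell
# grepsrc: recursive grep that skips VCS directories and log files.
# All remaining arguments are passed straight through to grep.
grepsrc() {
  grep -r --exclude='*.log' --exclude-dir=.git --exclude-dir=.svn "$@"
}
```

Placed in a shell startup file, it can then be used like grep itself, for example grepsrc something .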
You might also want to take a look at ack which, among many other features, by default does not search VCS directories like .svn and .git.
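find . -path ./ignore -prune -o -type f -exec grep -H something {} +
What that does is find all regular files under your current directory, excluding the directory (or file) named "ignore", then execute the command grep -H something on the files that were not ignored. The -type f test matters: without it, grep would also be handed "." itself, and a recursive grep on "." would search the ignored directory anyway.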
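A small sketch of the pruning approach, run in a scratch directory with a hypothetical layout. It restricts find to regular files with -type f so that grep is never handed a directory:

```shell
# Hypothetical project layout with one directory to be skipped.
proj=$(mktemp -d)
mkdir -p "$proj/src" "$proj/ignore"
echo 'something here' > "$proj/src/main.c"
echo 'something else' > "$proj/ignore/junk.c"

# -prune stops find from descending into ./ignore; every other regular
# file is passed to grep, which lists the ones that match.
found=$(cd "$proj" && find . -path ./ignore -prune -o -type f -exec grep -l something {} +)
echo "$found"

rm -rf "$proj"
```

The file under ignore/ also contains the word, but it is never passed to grep.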
Use shell expansion
shopt -s extglob
for file in !(file1_ignore|file2_ignore)
do
grep ..... "$file"
done
I think grep does not have filename filtering.
To accomplish what you are trying to do, you can combine find, xargs, and grep commands.
My memory is not good, so the example might not work:
find -name "foo" | xargs grep "pattern"
Find is flexible, you can use wildcards, ignore case, or use regular expressions.
You may want to read manual pages for full description.
After reading the next post, apparently grep does have filename filtering.
Here's a minimalistic version of .gitignore. Requires standard utils: awk, sed (because my awk is so lame), egrep:
cat > ~/bin/grepignore #or anywhere you like in your $PATH
egrep -v "`awk '1' ORS=\| .grepignore | sed -e 's/|$//g' ; echo`"
^D
chmod 755 ~/bin/grepignore
cat >> ./.grepignore #above set to look in cwd
ignorefile_1
...
^D
grep -r something * | grepignore
grepignore builds a simple alternation clause:
egrep -v ignorefile_one|ignorefile_two
not incredibly efficient, but good for manual use
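A possibly simpler variant of the same idea is to let grep read the ignore list directly. The sketch below assumes the .grepignore file holds one literal string per line, and relies on the -F (fixed strings) and -f (read patterns from a file) options; the file names are made up.

```shell
# Scratch directory with one file to keep and one to ignore.
sandbox=$(mktemp -d)
printf 'something\n' > "$sandbox/keep.txt"
printf 'something\n' > "$sandbox/ignorefile_1"
printf 'ignorefile_1\n' > "$sandbox/.grepignore"

# The second grep drops every result line that contains any of the
# strings listed in .grepignore.
kept=$(cd "$sandbox" && grep -r something . | grep -v -F -f .grepignore)
echo "$kept"

rm -rf "$sandbox"
```

Like the script above, this filters on the whole output line, so an ignore string that also appears in matched text would hide that line too.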

Using grep and xargs, avoiding 'unterminated quote' error

I'm getting frustrated enough that I figured it was time to ask a question.
I'm trying to replace an email address that is hard-coded into thousands of pages across a website. It's on a FreeBSD 6.3 server.
Here is the command I am using:
grep -R --files-with-matches 'Email\#domain.com' . | sort | uniq | xargs perl -pi -e 's/Email\#domain.com/Email\#newdomain.com/' *.html
And here is the error that I keep getting:
xargs: unterminated quote
Oddly enough, when I run that command on a test case of 3 files (in a nested structure) it works just fine. I've been googling and most solutions seem to deal with adding a -print0 after the . and a -0 after the xargs. However, this yields a different set of errors that lead me to believe I'm putting things in the wrong places.
thanks in advance for your help
Pax is correct. I would further correct it to something like:
grep -R --files-with-matches 'Email\#domain.com' . -print0 | xargs -0 perl -pi -e 's/Email\#domain.com/Email\#newdomain.com/'
EDIT:
Thanks to kcwu, this is the full FreeBSD version:
grep -R --files-with-matches 'Email\#domain.com' . --null | xargs -0 perl -pi -e 's/Email\#domain.com/Email\#newdomain.com/'
Note that I've removed sort and uniq. --files-with-matches is documented to "stop on the first match", so you will not get duplicate files. --null and -0 produce (and handle) a null-terminated file list, which is vital because POSIX allows filenames to contain newlines.
Note that I don't know perl, but I'm assuming that part's roughly equivalent to:
sed -i s/Email\#domain.com/Email\#newdomain.com/g
Why are you giving a list of HTML files to xargs? That program takes its file list from the pipeline (output of grep).
Use GNU Parallel:
grep -R --files-with-matches 'Email\#domain.com' . | sort | uniq | parallel -q perl -pi -e 's/Email\#domain.com/Email\#newdomain.com/g'
Watch the intro video to learn more: http://www.youtube.com/watch?v=OpaiGYxkSuQ

Resources