grep recursive filename matching (grep -ir "xyz" *.cpp) does not work - grep

The command
grep -ir "xyz" * recursively searches through the directories and tells me that the text is present in ./x/y/z/abc.cpp.
However,
grep -ir "xyz" *.cpp returns no results.
Isn't the second command supposed to recursively grep all cpp files inside the directory?
What am I missing here?

Grep will recurse through any directories you match with your glob pattern. (In your case, you probably do not have any directories that match the pattern "*.cpp") You could explicitly specify them: grep -ir "xyz" *.cpp */*.cpp */*/*.cpp */*/*/*.cpp, etc. You can also use the --include option (see the example below)
If you are using GNU grep, then you can use the following:
grep -ir --include "*.cpp" "xyz" .
The command above says to search recursively starting in current directory ignoring case on the pattern and to only search in files that match the glob pattern "*.cpp".
OR if you are on some other Unix platform, you can use this:
find ./ -type f -name "*.cpp" -print0 | xargs -0 grep -i "xyz"
If you are sure that none of your files have spaces in their names, you can omit the -print0 argument to find and the -0 to xargs
The command above says the following: find all files (-type f) under the current directory (./) that match the name glob/wildcard "*.cpp" (-name "*.cpp") and then print them out delimited by a null (-print0). That list of files found should be written to the stdin of the next command: xargs. xargs should read from stdin (default behavior) and split its input on nulls (-0) and then call the grep command with the specified options (grep -i "xyz") on that list of files.
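To see why the null-delimited form matters, here is a small sketch that builds a throwaway tree (the directory and file names are made up for illustration) containing a .cpp file with a space in its path, then runs the pipeline above with -l added so that only file names are printed. It assumes a find and xargs that support -print0 and -0, which the GNU and BSD versions do.

```shell
#!/bin/sh
# Hypothetical scratch tree; the names are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/src/my module"
printf 'int XYZ = 0;\n' > "$demo/src/my module/main file.cpp"
printf 'xyz\n'          > "$demo/src/readme.txt"
cd "$demo" || exit 1

# Null-delimited names survive the spaces in "my module/main file.cpp".
safe=$(find . -type f -name '*.cpp' -print0 | xargs -0 grep -il 'xyz')
echo "with -print0 / -0: $safe"

# Without them, xargs splits the path on whitespace and grep is handed
# fragments that do not exist, so nothing is found.
unsafe=$(find . -type f -name '*.cpp' | xargs grep -il 'xyz' 2>/dev/null)
echo "without:           ${unsafe:-<nothing found>}"

cd / && rm -rf "$demo"
```

The second pipeline fails quietly here only because its error messages are sent to /dev/null; in normal use grep would complain about the missing path fragments.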
If you are interested in learning more about why grep -ir "xyz" *.cpp does not work the way you think it should, you should search for "shell globbing" (here is a good first article on the subject). I'll also try to provide a quick explanation.
When you type in the command grep -ir "xyz" *.cpp and hit enter, two programs are involved in executing it. The first is your shell (unless you've done something to customize things, you are probably using the bash shell; if you've never heard of a shell or bash, that's where you should start looking, there are tons of good articles). Suffice it to say that a shell is just a program designed to let you navigate the filesystem on your computer and run other programs. (In Windows, when you double click on an icon to launch a program, or open a folder to access a file, the program that you are running is explorer.exe, and it is the Windows graphical shell.)
So, when you type the command grep -ir "xyz" *.cpp, the shell reads your command and does a few things before grep is run. One of the things it does is expand glob patterns (things like *.txt or [0-9]*.pdf). Like I said, if you want to understand it, go read more about it, but the thing you should take away is that the grep command never sees the *.cpp. What happens is, the shell looks in the current directory for any files or directories whose names match the pattern *.cpp and substitutes those names for the pattern on the command line BEFORE it runs the grep command. (If it doesn't find anything that matches, it leaves the *.cpp there and grep sees it, but because grep doesn't normally do glob matching on its file arguments, it just looks for a file literally named *.cpp, which doesn't do anything for you.)
By contrast, when you type in grep -ir "xyz" *, the shell replaces the * with the name of every file and directory in the current directory (because * matches anything). Let's say you had a directory that contained file1, file2, dir1, and dir2; the shell would perform its replacements and then execute a command that looked like this: grep -ir "xyz" file1 file2 dir1 dir2. That means grep would search file1 and file2 for a line with the string xyz, and because of the -r it would also search recursively through dir1 and dir2 and search any files found there for that string as well.
Lastly, if you've followed everything I've said so far, then it will make sense to you that grep does have a way to use glob patterns on recursive searches, and that is the --include option, as in the command I described earlier: grep -ir --include "*.cpp" "xyz" . The reason we put the *.cpp in quotes in that command is to prevent the shell from trying to expand the glob pattern before it runs the command.
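The expansion described above can be observed directly. The sketch below is only an illustration: it builds a scratch tree that mirrors the question (the only .cpp file is nested three levels down; notes.txt is an invented extra file), prints what the shell turns *.cpp into, and then runs the --include form. It assumes default bash or sh globbing (no nullglob or failglob) and a grep that supports --include.

```shell
#!/bin/sh
# Scratch tree mirroring the question: no .cpp file at the top level.
demo=$(mktemp -d)
mkdir -p "$demo/x/y/z"
printf 'int XYZ = 1;\n'       > "$demo/x/y/z/abc.cpp"
printf 'xyz in a text file\n' > "$demo/notes.txt"
cd "$demo" || exit 1

# What grep would receive: the shell expands the glob first. Nothing
# matches here, so the pattern is passed through unchanged.
expanded=$(printf '%s\n' *.cpp)
echo "shell expands *.cpp to: $expanded"

# Quoting the pattern hands it to grep, which applies it to the base name
# of every file it visits while recursing from ".".
found=$(grep -irl --include='*.cpp' 'xyz' .)
echo "grep --include finds:   $found"

cd / && rm -rf "$demo"
```

notes.txt also contains the text but is skipped, because its name does not match the --include pattern.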

Related

grep - difference between ways of specifying directory and globs

Say I'm in a project folder and want to grep a keyword using grep -rni. What's the difference between these 3 commands?
grep -rni . -e "keyword"
grep -rni * -e "keyword"
grep -rni **/* -e "keyword"
I tested this and noticed that the first two commands return the same number of matches, although in different ordering. The third one returned significantly more matches than the first two, however.
Is there any reason to use the third one ever? Is the reason it's returning more matches duplicates?
First of all, the difference has nothing to do with the arguments -n and -i.
From grep man page:
-n, --line-number
Prefix each line of output with the 1-based line number within its input file.
-i, --ignore-case
Ignore case distinctions in patterns and input data, so that characters that differ only in case match each other.
-r, --recursive
Read all files under each directory, recursively, following symbolic links only if they are on the command line. Note that if no file operand is given, grep searches the working directory. This is equivalent to the -d recurse option.
So, the difference is actually in how the strings * and **/* are interpreted by the shell.
With . you pass the current directory as an argument to grep. No mystery here, because grep itself walks the current working directory.
With * you pass every file in the current directory as an argument to grep (this includes directories, but by default not names that start with a dot).
Now, suppose you have the following directory structure:
├── file.txt
├── one
│   └── file.txt
└── two
    └── file.txt
Running grep -rni * -e keyword is translated to:
grep -rni file.txt one two -e keyword
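grep then processes the files and directories in that order, recursing into one and two.
Finally, in a shell with recursive globbing enabled (for example bash with the globstar option set, or zsh), grep -rni **/* -e keyword will translate to this command line: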
grep -rni file.txt one one/file.txt two two/file.txt -e keyword
The problem with this last approach is that some files will be processed more than once. For instance: one/file.txt will be processed twice: once because it is explicitly in the argument list, and another time because it belongs to the directory one, which is also in the argument list.
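The duplication is easy to reproduce. The following sketch recreates the example tree with one matching line per file and counts the output lines for each form. The argument list that **/* expands to is written out by hand, so the example does not depend on the shell's recursive-glob support; the file contents are invented for the demonstration.

```shell
#!/bin/sh
# Recreate the example tree; each file.txt holds exactly one matching line.
demo=$(mktemp -d)
mkdir -p "$demo/one" "$demo/two"
for f in file.txt one/file.txt two/file.txt; do
    printf 'a keyword here\n' > "$demo/$f"
done
cd "$demo" || exit 1

# "." and "*" both reach each file exactly once.
n_dot=$(grep -rni . -e keyword | wc -l | tr -d ' ')
n_star=$(grep -rni * -e keyword | wc -l | tr -d ' ')

# The expanded form of **/*: nested files are named explicitly AND
# reached again through their parent directories.
n_globstar=$(grep -rni file.txt one one/file.txt two two/file.txt -e keyword | wc -l | tr -d ' ')

echo "matches with .    : $n_dot"
echo "matches with *    : $n_star"
echo "matches with **/* : $n_globstar"

cd / && rm -rf "$demo"
```

The first two counts agree, while the third is inflated by the two nested files being reported twice, which is consistent with the extra matches seen in the question.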

Mingw64 shell's grep ignores -r option?

I'm trying to do a grep in Microsoft Windows, using the MINGW64 shell v4.4.23(1). (That's what the title bar says. I assume this means MingW-W64.)
I want to list all files in a specified directory tree that have a certain filename extension and do not contain a certain string.
With the current directory set to the top of the tree I entered
grep -r -L thestring *.theextension
It lists only files in the current directory, not the tree.
I tried some variations and determined that grep is simply ignoring the -r option. It ignores --recursive, too.
But when I enter grep --help, it lists both -r and --recursive as valid options, with the expected meaning.
Is this a bug in the shell, or am I doing something stupid?
With grep -r -L thestring *.theextension you are telling grep to search recursively in any file or folder matching *.theextension. If you don't have any folders matching that you shouldn't expect it to go through any other folders. The -L flag doesn't mean it's going to look at anything not matching *.theextension, maybe that's what was confusing you...
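One way to get the intended result, sketched under the assumption that this grep is GNU grep and therefore supports --include: pass the directory as the file operand and give the extension as a quoted --include pattern, so the shell leaves it alone. The tree, the .log extension, and the file contents below are placeholders.

```shell
#!/bin/sh
# Placeholder tree: one matching file at the top, one non-matching file nested.
demo=$(mktemp -d)
mkdir -p "$demo/sub/deeper"
printf 'has thestring\n' > "$demo/top.log"
printf 'nothing here\n'  > "$demo/sub/deeper/nested.log"
printf 'nothing here\n'  > "$demo/sub/ignored.txt"
cd "$demo" || exit 1

# -r walks the tree from ".", --include restricts the search to *.log,
# and -L lists the files that contain no match.
missing=$(grep -r -L --include='*.log' 'thestring' .)
echo "$missing"

cd / && rm -rf "$demo"
```

ignored.txt lacks the string too, but it is not listed because it does not match the --include pattern.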

Find multiple files containing a certain string and list the files that don't contain the string

In the midst of building a site checker I have run into a problem: the client needs to check all of their pages for certain strings, to see whether they are included in the code, and then list the files that do not have the code yet.
I tried multiple grep commands with no success. The -v option supposedly outputs the inverted match of the results, but that does not happen. Currently I am missing the part of the code telling grep to only search in specific files (for example, files named code.php) in all sub folders.
With the current code it searches all the files, even unnecessary ones.
grep -vrn '.' -e "STRING" > list.txt
I'd like to export a list of files (preferably checking only files with the same name in all sub folders) that do not possess the string that I am looking for.
grep -c STRING files will give you a count of lines with STRING for each file.
You can optimize it somewhat with -m1 to stop after the first match.
You can pipe that to sed to grab the files with zero matches:
grep -cm1 STRING files | sed -n '/:0$/s/:0$//p'
That gets you one file per line.
You can pipe that to xargs to merge it into a one-line list.
If your STRING is just that and not a regex, you could use the -F flag with grep to specify it's a fixed string, and that will also speed things up. So maybe...
grep -Fcm1 STRING files | sed -n '/:0$/s/:0$//p' > list.txt
...in that case
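A sketch of the same idea restricted to files named code.php anywhere under the current directory, as the question asks. It assumes a grep that supports -H (GNU and BSD grep do), which forces the file name to be printed even if find happens to pass a single file; the tree and the marker text are invented for the example. The sed expression anchors :0 at the end of the line so that a path which itself contains :0 is not mangled.

```shell
#!/bin/sh
# Invented site tree: only shop/code.php carries the marker.
demo=$(mktemp -d)
mkdir -p "$demo/shop" "$demo/blog"
printf '<?php // STRING\n' > "$demo/shop/code.php"
printf '<?php // empty\n'  > "$demo/blog/code.php"
printf '<?php // empty\n'  > "$demo/blog/other.php"
cd "$demo" || exit 1

# Count matches per code.php (stopping at the first one), keep only the
# zero-count lines, and strip the trailing ":0" to leave just the path.
find . -type f -name 'code.php' -exec grep -HFc -m1 'STRING' {} + |
    sed -n 's/:0$//p' > list.txt

lacking=$(cat list.txt)
echo "$lacking"

cd / && rm -rf "$demo"
```

With GNU grep, grep -rL --include='code.php' -F 'STRING' . should produce the same list in one step, since -L prints the names of files without a match.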

Strange behavior grep -rnw

I am using grep (BSD grep) 2.5.1-FreeBSD in MacOS and I have found the following behavior.
I have two *.tex files. Each one of these contains the following lines
$k$-th bit of
$(i-m)$-th bit of
respectively. When I ran
grep --color -rnw . -e '\$-th bit of' --include="*.tex"
I got only the second file, i.e., $(i-m)$-th bit of, while I expected both lines. Could you please help me understand this behavior?
Never use -r or --include or any other grep option to find files. The GNU guys really screwed up by adding those options to grep when there's a perfectly good tool named find for finding files and now they've turned grep into a convoluted mush of finding files and Globally matching a Regular Expression within a file and Printing the result (G/RE/P).
Keep it simple - find the files with find then g/re/p within them using grep:
find . -name '*.tex' -exec grep --color -n '\$-th bit of' {} +
As others pointed out your g/re/p problem was the -w arg so I've removed that above.
I have the same version of grep.
It is caused by your use of the -w option:
-w, --word-regexp
The expression is searched for as a word (as if surrounded by `[[:<:]]' and `[[:>:]]'; see re_format(7)).
The matched part of the string $k$-th bit of is bounded on the left-hand side by a word character (i.e. k) so the match is treated as being inside a "word" and it can't therefore satisfy the "searched for as a whole word" requirement.
Try without -w and it will work fine.
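The effect can be reproduced with two one-line files. This is a minimal sketch; the file names a.tex and b.tex are made up, and -l is used so that only the names of matching files are printed.

```shell
#!/bin/sh
# Two scratch files holding the lines from the question.
demo=$(mktemp -d)
printf '%s\n' '$k$-th bit of'     > "$demo/a.tex"
printf '%s\n' '$(i-m)$-th bit of' > "$demo/b.tex"
cd "$demo" || exit 1

# With -w the match must not touch a word character on either side,
# so the "$" that directly follows "k" is rejected.
with_w=$(grep -lw '\$-th bit of' a.tex b.tex)

# Without -w both lines match.
without_w=$(grep -l '\$-th bit of' a.tex b.tex)

echo "with -w:    $with_w"
echo "without -w:" $without_w

cd / && rm -rf "$demo"
```

In b.tex the match is preceded by ")", which is not a word character, so -w accepts it.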

grep alias search command not working

I am trying to make an alias called File Search (fs for short) that takes one argument (the search term). It then searches down the directory tree for that term using grep.
Example:
fs 'function my_function()'
What am I doing wrong?
alias fs='grep -R "$1" .'
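An alias cannot take arguments: the shell substitutes the alias text and leaves whatever you typed after it in place, so "$1" expands to the shell's own first positional parameter (normally empty) rather than your search term. You want the command inside a shell function or script, where "$1" is set; something like this (from memory, I'm not at a Unix machine right now):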
find . -type f | xargs grep "$1"
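Note that an alias never receives positional parameters: whatever follows the alias name is simply appended to the expanded text, so "$1" is not the search term. A shell function does get its arguments as $1, $2, and so on. A minimal sketch, assuming a POSIX-style shell and a grep that supports -R; the demo tree is made up, and -F is an addition here so that the term is taken literally rather than as a regular expression:

```shell
#!/bin/sh
# fs TERM: search every file under the current directory for TERM.
fs() {
    grep -RF -- "$1" .
}

# Throwaway tree to try it on.
demo=$(mktemp -d)
mkdir -p "$demo/lib"
printf 'function my_function() {\n' > "$demo/lib/util.php"
printf 'unrelated\n'                > "$demo/index.php"
cd "$demo" || exit 1

hit=$(fs 'function my_function()')
echo "$hit"

cd / && rm -rf "$demo"
```

Placing the function definition in a shell startup file such as ~/.bashrc makes fs available in new sessions, the same way an alias would be.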

Resources