get filenames with paths that match a basename pattern - grep

I have a list of filenames with their paths like this
/some/path/or/another/RCrandomname.TRI
/another/path/NCrandomname2.TRI
/one/more/path/RCrandomname3.PCD
I would like to pick only the filenames (with their paths) whose basename starts with RC and has the extension TRI,
so in the above example I would like to get just
/some/path/or/another/RCrandomname.TRI
if I had the basename only I could do
ls | grep "^RC.*\.TRI$" > filenames
but here I have all the paths.
I am required to use ls.

Don't parse the output of ls; it is error-prone in many ways.
You can use globbing with shopt globstar:
shopt -s globstar
ls -1 **/RC*.TRI > filenames
When globstar is enabled, the globbing code treats ** specially: it matches all directories (and files within them, when appropriate) recursively.
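Since the question starts from an existing list of paths rather than a directory tree, a grep-only filter may also be worth a look. The following is a minimal sketch, assuming the list is newline-separated and arrives on standard input; the function name filter_rc_tri is hypothetical. The pattern is anchored at the last slash, so only the basename is tested.

```shell
# Hypothetical helper: keep only the paths whose basename starts with RC
# and ends with .TRI. Reads a newline-separated list of paths on stdin.
filter_rc_tri() {
    # (^|/)  the basename starts at the beginning of the line or after a slash
    # [^/]*  no further slashes, so RC must belong to the last path component
    grep -E '(^|/)RC[^/]*\.TRI$'
}
```

It could be used as filter_rc_tri < all_paths > filenames, where all_paths is a hypothetical file holding the list.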

Find the count of a specific keyword in multiple files in a directory

Say I have a directory /home/ and within it I have 3 subdirectories /home/red/ /home/blue/ /home/green/
And each subdirectory contains a file each like
/home/red/file1 /home/blue/file2 /home/green/file3
Now I want to find how many times each of file1, file2, and file3 contains the word "hello".
For example,
/home/red/file1 - 23
/home/blue/file2 - 6
/home/green/file3 - 0
Now, going to the locations of file and running the grep command is actually very inefficient when this problem scales.
I have tried using this grep command from the /home/ directory
grep -rnw '/path/to/somewhere/' -e 'pattern'
But this is just giving the occurrences rather than the count.
Is there any command through which I can get what I am looking for?
If the search term occurs at most once per line, you can use grep's -c option to report the number of matching lines per file instead of the lines themselves. So, the command will be grep -rc 'search' (add other options as needed).
If there can be more than one occurrence per line, I'd recommend using ripgrep. Note that rg recursively searches by default, so you can use something like rg -co 'search' from within the home directory (add other options as needed). Add --hidden if you need to search hidden files as well. Add --include-zero if you want to show files even if they didn't have any match.
Instead of grep, you can use this find and GNU awk solution (ENDFILE requires gawk):
cd /home
find {red/file1,blue/file2,green/file3} -type f -exec awk '
{c += gsub(/pattern/, "&")} ENDFILE {print FILENAME, "-", c+0; c=0}' {} +
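If neither ripgrep nor gawk is available, counting every occurrence (not just matching lines) can also be sketched with grep -o, which prints each match on its own line so that wc -l can count them. This assumes a grep that supports -o (GNU and BSD grep do; it is not in POSIX), and the function name count_occurrences is hypothetical. Overlapping matches are not counted separately.

```shell
# Hypothetical helper: count_occurrences PATTERN FILE...
# Prints "FILE - N" for each file, where N is the number of matches.
count_occurrences() {
    pattern=$1
    shift
    for f in "$@"; do
        # grep -o emits one line per match; grep exits non-zero when
        # nothing matches, which is expected here, hence "|| true".
        n=$(( $( { grep -o -e "$pattern" "$f" || true; } | wc -l ) ))
        printf '%s - %d\n' "$f" "$n"
    done
}
```

For the layout in the question, this might be called as count_occurrences hello /home/*/file*.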

grep: Find all files containing the word `star`, but not the word `start`

I have a bunch of files: some contain the word star, some contain the word start, some contain both.
I'd like to grep for files that contain the word star, but not the word start.
How can this be accomplished using only grep?
grep has some options for inverting the matches at the line or file level. You want the latter option, with the -L switch. The following will print the names of all the files in a folder that don't contain the text start:
grep -LF start *
-F tells grep that start is a literal string and not a regex. It's optional here, but might speed things up a tiny bit.
You can use the resulting list to search for files that contain star:
grep -lF star $(grep -LF start *)
-l prints only the names of files containing a match, not any line-by-line or match-by-match details. If this is not exactly what you want, man grep is your friend.
This uses an additional shell construct to run the inverted match, but it technically doesn't call any additional programs that aren't grep.
Update
Since you mention wanting to look through all the files starting with a given root folder, change -LF to -LFr. Replace * with your root folder if you don't want to change working directories.
-r tells grep to recurse into directories, and search every file it finds along the way.
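The nested $(...) form above splits file names on whitespace, so it can misbehave with names containing spaces. A minimal sketch of the same idea that tests one file at a time is shown here; it still calls no program other than grep, and the function name files_with_star_not_start is hypothetical.

```shell
# Hypothetical helper: print each given file that contains "star"
# but does not contain "start" anywhere.
files_with_star_not_start() {
    for f in "$@"; do
        [ -f "$f" ] || continue   # skip directories and other non-files
        # -q: no output, only the exit status; -F: literal string match
        if grep -qF star -- "$f" && ! grep -qF start -- "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling files_with_star_not_start * checks the current directory.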
With GNU grep for -w:
$ cat file
foo star bar
oof start rab
$ grep -w star *
foo star bar
or if you just want the names of the files containing star:
$ grep -lw star *
file
and to restrict the search to regular files in the current directory:
$ find . -maxdepth 1 -type f -exec grep -w 'star' {} \;
foo star bar

Grep --exclude-dir (root directory only)

I'm trying to set up a grep command that searches my current directory but excludes a directory only if it's directly under the root of the search.
So for the following directories, I want #1 to be excluded, and #2 to be included
1) vendor/phpunit
2) app/views/vendor
I originally started with the below command
grep -Ir --exclude-dir=vendor keywords *
I tried using ^vendor, ^vendor/, and similar variants, but nothing seems to work.
Is there a way to do this with grep? I was looking to try to do it with one grep call, but if I have to, I can pipe the results to a second grep.
With pipes:
grep -Ir keywords * | grep -v '^vendor/'
The problem with --exclude-dir is that it tests the name of the directory, not its path, before descending into it, so it cannot distinguish between two vendor directories at different depths.
Here is a better solution, which will actually ignore the specified directory:
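function grepex(){
    # First argument: the top-level directory to skip.
    excludedir="$1"
    shift
    # Search every other top-level entry, passing the remaining
    # arguments through to grep unchanged.
    for i in *; do
        if [ "$i" != "$excludedir" ]; then
            grep "$@" "$i"
        fi
    done
}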
You use it as a drop-in replacement to grep, just have the excluded dir as the first argument and leave the * off the end. So, your command would look like:
grepex vendor -Ir keywords
It's not perfect, but as long as you don't have any really weird folders (e.g. with names like -- or something), it will cover most use cases. Feel free to refine it if you want something more elaborate.
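Another way to skip only the top-level directory is to let find do the traversal and prune that one path. This is a minimal sketch, not a drop-in replacement for the function above; the name grep_skip_top_dir is hypothetical, and -I and -H are assumed to be supported (they are in GNU and BSD grep).

```shell
# Hypothetical helper: grep_skip_top_dir DIR GREP-ARGS...
# Searches all files under the current directory except those under ./DIR.
grep_skip_top_dir() {
    skip=$1
    shift
    # -path "./$skip" matches only the top-level directory, so a deeper
    # directory with the same name (e.g. app/views/vendor) is still searched.
    # -prune stops find from descending into the matched directory.
    find . -path "./$skip" -prune -o -type f -exec grep -IH "$@" {} +
}
```

With the layout from the question, grep_skip_top_dir vendor keywords would skip vendor/phpunit but still search app/views/vendor.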

How to use grep to search only in a specific file types?

I have a lot of files and I want to find where is MYVAR.
I'm sure it's in one of .yml files but I can't find in the grep manual how to specify the filetype.
grep -rn --include='*.yml' "MYVAR" your_directory
Please note that grep is case-sensitive by default (pass -i to ignore case) and accepts regular expressions as well as plain strings. The pattern given to --include is quoted so that the shell does not try to expand it.
You don't give grep a filetype, just a list of files. Your shell can expand a pattern to give grep the correct list of files, though:
$ grep MYVAR *.yml
If your .yml files aren't all in one directory, it may be easier to up the ante and use find:
$ find . -name '*.yml' -exec grep MYVAR {} +
This finds, starting from the current directory and recursing deeper, any files ending with .yml, and substitutes them for the pair of braces {}. The trailing + terminates the -exec expression and tells find to pass as many file names as possible to each grep invocation, rather than running grep once per file. The result is a list of matching files handed to grep.
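The find-based search can be wrapped in a small function. This is a minimal sketch; the name search_yml and its argument order are hypothetical, and -H is assumed to be available (GNU and BSD grep) so that the file name is printed even when a batch contains a single file.

```shell
# Hypothetical helper: search_yml PATTERN [DIR]
# Recursively greps every regular *.yml file under DIR (default: .).
search_yml() {
    pattern=$1
    dir=${2:-.}
    # -type f skips directories whose names happen to end in .yml.
    # "{} +" batches many file names into each grep invocation.
    # -H prints the file name, -n the line number of each match.
    find "$dir" -type f -name '*.yml' -exec grep -Hn -e "$pattern" {} +
}
```

For example, search_yml MYVAR your_directory.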
If all your .yml files are in one directory, then cd to that directory, and then ...
grep MYVAR *.yml
If your .yml files are spread across multiple directories, then cd to the top of those directories, and then ...
grep MYVAR `find . -name \*.yml`
If you don't know the top of those directories where your .yml files are located and want to search the whole system ...
grep MYVAR `find / -name \*.yml`
The last option may require root privileges to read through all directories.
The ` character above is the backtick, which shares a key with ~ on most keyboards.
find . -name \*.yml -exec grep -Hn MYVAR {} \;

grep in all directories

I have a directory named XYZ which has directories ABC, DEF, GHI inside it. I want to search for a pattern 'writeText' in all *.c files in all directories (i.e., XYZ, XYZ/ABC, XYZ/DEF and XYZ/GHI)
What grep command can I use?
Also if I want to search only in XYZ, XYZ/ABC, XYZ/GHI and not XYZ/DEF, what grep command can I use?
Thank you!
grep -R --include="*.c" --exclude-dir=DEF writeText /path/to/XYZ
-R means recursive, so it will go into subdirectories of the directory you're grepping through
--include="*.c" means "look for files ending in .c"
--exclude-dir=DEF means "exclude directories named DEF". If you want to exclude multiple directories, use brace expansion: --exclude-dir={DEF,GBA,XYZ}
writeText is the pattern you're grepping for
/path/to/XYZ is the path to the directory you want to grep through.
Note that these flags apply to GNU grep and may differ if you're using BSD/SysV/AIX grep. If you're using Linux/GNU grep utils you should be fine.
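The multi-directory form relies on bash brace expansion, which only takes effect when the braces contain a comma (or a range); braces around a single word are passed to grep literally. The short sketch below prints what grep would actually receive in each case. It assumes bash is installed and invokes it explicitly, since brace expansion is not part of POSIX sh; the directory names are only illustrative.

```shell
# Show the arguments produced by the shell, one per line.
# A single item in braces is left untouched by bash:
single=$(bash -c 'printf "%s\n" --exclude-dir={DEF}')
# Two or more comma-separated items expand to one option per item:
multi=$(bash -c 'printf "%s\n" --exclude-dir={DEF,GHI}')

printf 'single item:\n%s\n' "$single"
printf 'two items:\n%s\n' "$multi"
```

So to exclude exactly one directory, writing --exclude-dir=DEF without braces is the safer form.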
You can use the following command to answer at least the first part of your question.
find . -name '*.c' -print0 | xargs -0 grep "writeText"
