To understand recursive grep in xargs - grep

What is the practical difference between the following two commands?
Command A
find . -type f -print0 | xargs -0 grep -r masi
Command B
find . -type f -print0 | xargs -0 grep masi
In short, what is the practical benefit of Command A?

None. The -r option makes grep search directories recursively, but -type f prevents find from returning any directory names, so grep never receives a directory to recurse into.

I think there is none.
Command A asks grep to recurse into every name it is given, but find only returns regular files here, so there is nothing to recurse into.
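The claim above can be checked with a small experiment. This is only a sketch: the scratch directory and the file names in it are made up for the demonstration, and it assumes a find and xargs that support -print0 and -0.

```shell
# Hypothetical scratch tree, created only for this comparison
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'masi here\n' > "$demo/a.txt"
printf 'also masi\n' > "$demo/sub/b.txt"

# Command A (with -r) and Command B (without -r) over the same file list
out_a=$(find "$demo" -type f -print0 | xargs -0 grep -r masi | sort)
out_b=$(find "$demo" -type f -print0 | xargs -0 grep masi | sort)

if [ "$out_a" = "$out_b" ]; then
    echo "identical output"
fi
```

Because find already walks the tree and hands grep only regular files, both commands should report the same matches.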

Related

How to search contents of Linux files modified between certain dates

I want to search a directory (excluding paths that contain any of certain words, ideally a regex pattern) and find all files with contents that match my query (ideally a regex pattern, which I'd make case-insensitive) and were modified between 2 specific dates.
Based on this answer, my current command is:
find /mnt/c/code -type f -mtime -100 -mtime +5 -print0 |
xargs -0 grep -l -v "firstUnwantedTerm" 'mySearchTerm'
Apparently this query does not exclude all paths that contain "firstUnwantedTerm".
Also, I'd love if the results could be sorted by modified datetime descending, displaying: their modified time, the full file name, and the search query (maybe in a different color in the console) surrounded by some context where it was seen.
grep -rnwl --exclude='*firstUnwantedTerm*' '/mnt/c/code' -e "mySearchTerm" from here also seemed to be a step in the right direction in the sense that it seems to correctly exclude my exclusion term, but it doesn't filter by modified datetime and doesn't output all the desired fields, of course.
This is just quick & dirty and without sorting by date, but with 3 lines of context before/after each match and coloured matches:
find ~/mnt/c/code -type f -mtime -100 -mtime +5 | grep -v 'someUnwantedPath' | xargs -I '{}' sh -c "ls -l '{}' && grep --color -C 3 -h 'mySearchTerm' '{}'"
Broken down into pieces with some explanation:
# Find regular files between 100 and 5 days old (modification time)
find ~/mnt/c/code -type f -mtime -100 -mtime +5 |
# Remove unwanted files from list
grep -v 'someUnwantedPath' |
# List each file, then find search term in each file,
# highlighting matches and
# showing 3 lines of context above and below each match
xargs -I '{}' sh -c "ls -l '{}' && grep --color -C 3 -h 'mySearchTerm' '{}'"
I think you can take it from here. Of course this can be made more beautiful and fulfill all your requirements, but I just had a couple of minutes and leave it to the UNIX gurus to beat me and make this whole thing 200% better.
Update: version 2 without xargs and with only one grep command:
find ~/mnt/c/code -type f -mtime -30 -mtime +25 ! -path '*someUnwantedPath*' -exec stat -c "%y %s %n" {} \; -exec grep --color -C 3 -h 'mySearchTerm' {} \;
! -path '*someUnwantedPath*' filters out unwanted paths, and the two -exec subcommands list candidate files and then show the grep results (which could also be empty), just like before. Please note that I changed from using ls -l to stat -c "%y %s %n" in order to list file date, size and name (just modify as you wish).
Again, with additional line breaks for readability:
find ~/mnt/c/code \
  -type f \
  -mtime -30 -mtime +25 \
  ! -path '*someUnwantedPath*' \
  -exec stat -c "%y %s %n" {} \; \
  -exec grep --color -C 3 -h 'mySearchTerm' {} \;
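Neither version above sorts by modification time. One possible way to add that, sketched here under the assumption of GNU find, touch, and sort, is to have find print each file's mtime as epoch seconds, sort on that number, and only then run grep on each file. The scratch directory and its two files are hypothetical, and the sketch does not handle newlines in file names.

```shell
# Hypothetical test files with controlled modification times (GNU touch -d)
search_dir=$(mktemp -d)
printf 'mySearchTerm in the older file\n' > "$search_dir/old.txt"
printf 'mySearchTerm in the newer file\n' > "$search_dir/new.txt"
touch -d '10 days ago' "$search_dir/old.txt"
touch -d '7 days ago' "$search_dir/new.txt"

# Emit "epoch-seconds<TAB>path", sort newest first, drop the timestamp,
# then list the files that contain the term (case-insensitive)
result=$(find "$search_dir" -type f -mtime -100 -mtime +5 \
        ! -path '*someUnwantedPath*' -printf '%T@\t%p\n' |
    sort -rn |
    cut -f2- |
    while IFS= read -r file; do
        grep -l -i 'mySearchTerm' "$file"
    done)
printf '%s\n' "$result"
```

Replacing grep -l with the stat and grep --color -C 3 pair from the answer would print the date and the match context in the same sorted order.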

tar files after reducing list via find and grep?

I have worked out this command to give me the list of files I want to send to tar, how do I send this list to tar?
find . -not -type l | grep -E "(^\.\/bin\/custom|^\.\/config\/local)" | grep -v -E "(.settings|.classpath|.external)"
I want to preserve the hierarchy of bin/custom and config/local*
I don't want any other files (which there are a LOT of), the bin/custom is a directory and config/local* are files in config
I don't want any symbolic links
I want to exclude some of the hidden files (.settings|.classpath|.external)
You can use a construction like this:
tar cvf tarfile.tar $(find . -type f | grep -E "(^\.\/bin\/custom|^\.\/config\/local)" | grep -v -E "(.settings|.classpath|.external)")
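You just provide the list of files to be added to the tar archive.
There is also no need for -not -type l; -type f returns only regular files (and not links).
If there are many files, the command substitution can exceed the command-line length limit; something like this can resolve the issue (although if xargs has to split the list across several tar runs, each run with c overwrites the archive written by the previous one):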
find . -type f | grep -E "(^\.\/bin\/custom|^\.\/config\/local)" | grep -v -E "(.settings|.classpath|.external)"|xargs tar cvf tarfile.tar
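As an alternative sketch, tar can read the file list from standard input itself via -T -, so the list never has to fit on a command line and tar runs only once. The directory layout below is invented for the demonstration, and the exclusion regex is written with escaped dots, which is an assumption about what the question intended.

```shell
# Hypothetical project tree
work=$(mktemp -d)
mkdir -p "$work/bin/custom" "$work/config" "$work/other"
echo a > "$work/bin/custom/tool.sh"
echo b > "$work/config/local.properties"
echo c > "$work/other/skip.txt"
echo d > "$work/bin/custom/.classpath"

(
    cd "$work" || exit 1
    # Same filtering pipeline, but tar reads the names from stdin (-T -)
    find . -type f |
        grep -E '^\./(bin/custom|config/local)' |
        grep -v -E '\.(settings|classpath|external)' |
        tar -cf "$work/tarfile.tar" -T -
)

# List what ended up in the archive
archived=$(tar -tf "$work/tarfile.tar" | sort)
printf '%s\n' "$archived"
```

The directory hierarchy is preserved because the names are passed to tar relative to the directory in which find was run.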

How to grep for filenames found by find in other files?

How can I grep for the result of find within another pattern?
That's how I get all filenames with a certain pattern (in my case ending with "ext1")
find . -name *ext1 -printf "%f\n"
And then I want to grep for these filenames with another pattern (in my case ending on "ext2"):
grep -r '[filename]' *ext2
I tried with
find . -name *ext1 -printf "%f\n" | xargs grep -r *ext2
But this only makes grep tell me that it cannot find the files found by find.
You would tell grep that the patterns are in a file with the -f option, and use the "stdin filename" -:
find ... | grep -r -f - *ext2
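A small end-to-end sketch of this idea follows. The files are hypothetical, -printf assumes GNU find, the glob passed to -name is quoted so the shell does not expand it, and -F is added on the assumption that the file names should be matched literally rather than as regular expressions.

```shell
# Hypothetical files: two *.ext1 names, one *.ext2 file that mentions one of them
demo=$(mktemp -d)
touch "$demo/report.ext1" "$demo/unused.ext1"
printf 'include report.ext1\nnothing relevant\n' > "$demo/index.ext2"

# find prints the bare file names; grep reads them as patterns from stdin (-f -)
matches=$(find "$demo" -name '*ext1' -printf '%f\n' | grep -F -f - "$demo"/*ext2)
printf '%s\n' "$matches"
```

Only the line mentioning report.ext1 is printed, since unused.ext1 does not occur in the searched file.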

Delete a list of files with find and grep

I want to delete all files which have names containing a specific word, e.g. "car".
So far, I came up with this:
find|grep car
How do I pass the output to rm?
find . -name '*car*' -exec rm -f {} \;
or pass the output of your pipeline to xargs:
find | grep car | xargs rm -f
Note that these are very blunt tools, and you are likely to remove files that you did not intend to remove. Also, no effort is made here to deal with files that contain characters such as whitespace (including newlines) or leading dashes. Be warned.
To view what you are going to delete first, since rm -fr is such a dangerous command:
find /path/to/file/ | grep car | xargs ls -lh
Then, if the results are what you want, run the real command by replacing ls -lh with rm -fr:
find /path/to/file/ | grep car | xargs rm -fr
I like to use
rm -rf $(find . | grep car)
It does exactly what you ask: it runs rm -rf on whatever grep car returns from the output of find ., which is a recursive list of every file and folder.
You can use ls and grep to find your files and rm -rf to delete the files.
rm -rf $(ls | grep car)
But using this command is not a good idea if there is a chance that directories or files you don't want to delete have names matching the pattern you give to grep.
You really want to use find with -print0 and rm with --:
find [dir] [options] -print0 | grep --null-data [pattern] | xargs -0 rm --
A concrete example (removing all files below the current directory containing car in their filename):
find . -print0 | grep --null-data car | xargs -0 rm --
Why is this necessary:
-print0, --null-data and -0 change the handling of the input/output from parsed as tokens separated by whitespace to parsed as tokens separated by the \0-character. This allows the handling of unusual filenames (see man find for details)
rm -- makes sure that files whose names start with a - are actually removed instead of being treated as options to rm. If there is a file called -rf and you run find . -print0 | grep --null-data r | xargs -0 rm, the file -rf may not be removed and may instead alter how rm treats the other files.
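To see the null-delimited pipeline handle awkward names, here is a throwaway sketch. The scratch directory and the three file names are made up, and --null-data assumes GNU grep.

```shell
# Hypothetical files: one with a space, one starting with a dash, one to keep
demo=$(mktemp -d)
touch "$demo/car copy.txt" "$demo/-car.txt" "$demo/keep.txt"

(
    cd "$demo" || exit 1
    # Names travel as NUL-terminated records all the way to rm
    find . -type f -print0 | grep --null-data car | xargs -0 rm --
)

remaining=$(ls -A "$demo")
printf '%s\n' "$remaining"
```

Running find from inside the directory keeps the pattern from matching parts of the parent path by accident.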
This finds a file with matching pattern (*.xml) and greps its contents for matching string (exclude="1") and deletes that file if a match is found.
find . -type f -name "*.xml" -exec grep exclude=\"1\" {} \; -exec rm {} \;
Most of the other solutions presented here have problems with handling file names with spaces in them. Here's a solution that handles spaces properly.
grep -lRZ car . | xargs -0 rm
Notes on arguments used:
-l tells grep to print only filenames
-R enables grep recursive search in subfolders
-Z tells grep to separate results by \0 instead of \n
-0 tells xargs to separate input arguments by \0 instead of whitespace
car is the regular expression to search for (note that grep matches it against file contents here, not file names)
. is the folder where to search
Can also use rm -f to force the removal (as usual).
A bit of necromancy, but you can also use find, grep, and xargs
find . -type f | grep -e "pattern1" -e "pattern2" | xargs rm -rf
^ The find part may need some attention to fit your needs, such as restricting the file type, setting -mindepth and -maxdepth, and any globbing.
When find | grep car | xargs rm -f is given results such as:
/path/to/car
/path/to/car copy
files whose names contain whitespace will not be removed.
So my answer is:
find | grep car | while IFS= read -r line ; do
rm -rf "${line}"
done
This way, files whose names contain whitespace are removed as well.
find start_dir -iname \*car\* -exec rm -v {} \;
I use:
find . | grep "car" | while IFS= read -r i; do echo "$i"; rm -f "$i"; done
This works even if there are spaces in the file names, and it works recursively, with directories included in the search.
Use rm with the wildcard *:
rm * deletes all (non-hidden) files in the current directory
rm *.ext deletes all files that have the extension ext
rm word* deletes all files whose names start with word.

find grep exclude some file names for dos2unix

This is as far as I have gotten:
prompt$ find path/to/project -type f | grep -v '*.ori|*.pte|*.uh|*.mna' | xargs dos2unix 2> log.txt
However, the files with extensions .ori, .pte, .uh and .mna still show up.
It is better to leave the excluding to find, see Birei's answer.
The problem with your grep pattern is that you have specified it as a shell glob. By default grep expects basic regular expressions (BRE) as its first argument. So if you replace your grep pattern with: .*\.\(ori\|pte\|uh\|mna\)$ it should work. Or if you would rather use extended regular expressions (ERE), you can enable them with -E. Then you can express the same exclusion like this: .*\.(ori|pte|uh|mna)$.
Full command-line:
find . -type f | grep -vE '.*\.(ori|pte|uh|mna)$'
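The effect of the corrected pattern can be checked on a plain list of names without touching any files. The paths below are invented for illustration.

```shell
# Hypothetical file list, one path per line
files=$(printf '%s\n' src/main.c data/map.ori notes/a.pte lib/x.uh old/y.mna README)

# Keep only the names that do NOT end in one of the excluded extensions
kept=$(printf '%s\n' "$files" | grep -vE '\.(ori|pte|uh|mna)$')
printf '%s\n' "$kept"
```

The leading .* from the answer is optional, because grep matches the pattern anywhere in the line and the $ already anchors it to the end.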
One way:
find path/to/project *.* -type f ! \( -name '*.ori' -o -name '*.pte' -o -name '*.uh' -o -name '*.mna' \) |
xargs dos2unix 2> log.txt

Resources