Find files starting with NULs - grep

How do I efficiently find all the files in the system whose contents start with \x0000000000 (5 NUL bytes)?
I tried the following:
$ find . -type f -exec grep -m 1 -ovP "[^\x00]" {} \;
$ find . -type f -exec grep -m 1 -vP "^\00{5}" {} \;
but the first variant works only for all-NUL files, and the second one searches through the whole file, not only the first 5 bytes, which makes it very slow and gives many false positives.

Try this:
grep -r '^\\x0000000000' * | cut -d ":" -f 1
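Because only the first five bytes matter, another option is to compare them directly instead of pattern-matching whole files with grep. The following is a minimal sketch, assuming GNU cmp (for -n) and a readable /dev/zero; the function name is made up for illustration.

```shell
# Hypothetical helper: list regular files under a directory whose first
# 5 bytes are all NUL. Assumes GNU cmp (-n limits the comparison length).
find_nul_prefixed() {
  # -size +4c skips files shorter than 5 bytes up front.
  find "$1" -type f -size +4c -exec sh -c '
    for f do
      # Compare at most 5 bytes of the file against /dev/zero.
      cmp -s -n 5 "$f" /dev/zero && printf "%s\n" "$f"
    done' sh {} +
}

# Example call:
# find_nul_prefixed .
```

Each file is read for at most five bytes, so the cost is dominated by walking the directory tree rather than by file size.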

Related

How to search contents of Linux files modified between certain dates

I want to search a directory (excluding paths that contain any of certain words, ideally a regex pattern) and find all files with contents that match my query (ideally a regex pattern, which I'd make case-insensitive) and were modified between 2 specific dates.
Based on this answer, my current command is:
find /mnt/c/code -type f -mtime -100 -mtime +5 -print0 |
xargs -0 grep -l -v "firstUnwantedTerm" 'mySearchTerm'
Apparently this query does not exclude all paths that contain "firstUnwantedTerm".
Also, I'd love it if the results could be sorted by modified datetime descending, displaying their modified time, the full file name, and the search query (maybe in a different color in the console) surrounded by some context where it was seen.
grep -rnwl --exclude='*firstUnwantedTerm*' '/mnt/c/code' -e "mySearchTerm" from here also seemed to be a step in the right direction in the sense that it seems to correctly exclude my exclusion term, but it doesn't filter by modified datetime and doesn't output all the desired fields, of course.
This is just quick & dirty and without sorting by date, but with 3 lines of context before/after each match and coloured matches:
find ~/mnt/c/code -type f -mtime -100 -mtime +5 | grep -v 'someUnwantedPath' | xargs -I '{}' sh -c "ls -l '{}' && grep --color -C 3 -h 'mySearchTerm' '{}'"
Broken down into pieces with some explanation:
# Find regular files between 100 and 5 days old (modification time)
find ~/mnt/c/code -type f -mtime -100 -mtime +5 |
# Remove unwanted files from list
grep -v 'someUnwantedPath' |
# List each file, then find search term in each file,
# highlighting matches and
# showing 3 lines of context above and below each match
xargs -I '{}' sh -c "ls -l '{}' && grep --color -C 3 -h 'mySearchTerm' '{}'"
I think you can take it from here. Of course this can be made more beautiful and fulfill all your requirements, but I just had a couple of minutes and leave it to the UNIX gurus to beat me and make this whole thing 200% better.
Update: version 2 without xargs and with only one grep command:
find ~/mnt/c/code -type f -mtime -30 -mtime +25 ! -path '*someUnwantedPath*' -exec stat -c "%y %s %n" {} \; -exec grep --color -C 3 -h 'mySearchTerm' {} \;
! -path '*someUnwantedPath*' filters out unwanted paths, and the two -exec subcommands list candidate files and then show the grep results (which could also be empty), just like before. Please note that I changed from using ls -l to stat -c "%y %s %n" in order to list file date, size and name (just modify as you wish).
Again, with additional line breaks for readability:
find ~/mnt/c/code \
  -type f \
  -mtime -30 -mtime +25 \
  ! -path '*someUnwantedPath*' \
  -exec stat -c "%y %s %n" {} \; \
  -exec grep --color -C 3 -h 'mySearchTerm' {} \;
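For two specific calendar dates rather than day offsets, GNU find's -newermt test can bound the modification time, and -printf combined with sort can order the results newest first. This is a rough sketch, assuming GNU find and file names without newlines; the function name and the dates are placeholders.

```shell
# Hypothetical helper: list files under DIR modified after START and not
# after END, newest first, as "YYYY-MM-DD HH:MM  path".
# Assumes GNU find (-newermt, -printf) and file names without newlines.
files_modified_between() {
  dir=$1 start=$2 end=$3
  find "$dir" -type f -newermt "$start" ! -newermt "$end" \
       -printf '%T@\t%TY-%Tm-%Td %TH:%TM  %p\n' |
    sort -rn |      # sort on the leading epoch timestamp, newest first
    cut -f 2-       # drop the sort key again
}

# Example call (placeholder dates):
# files_modified_between /mnt/c/code '2024-01-01' '2024-02-01'
```

The resulting list could then be fed to grep --color -C 3 as in the commands above; the ! -path exclusion can be added to the find call in the same way.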

tar files after reducing list via find and grep?

I have worked out this command to give me the list of files I want to send to tar, how do I send this list to tar?
find . -not -type l | grep -E "(^\.\/bin\/custom|^\.\/config\/local)" | grep -v -E "(.settings|.classpath|.external)"
I want to preserve the hierarchy of bin/custom and config/local*
I don't want any other files (which there are a LOT of), the bin/custom is a directory and config/local* are files in config
I don't want any symbolic links
I want to exclude some of the hidden files (.settings|.classpath|.external)
You can use a construction like this:
tar cvf tarfile.tar $(find . -type f | grep -E "(^\.\/bin\/custom|^\.\/config\/local)" | grep -v -E "(.settings|.classpath|.external)")
You just provide the list of files to be added to the tar archive.
There is no need to use -not -type l; -type f matches only regular files (and not symbolic links).
If there are too many files for a single command line, something like this can help:
find . -type f | grep -E "(^\.\/bin\/custom|^\.\/config\/local)" | grep -v -E "(.settings|.classpath|.external)"|xargs tar cvf tarfile.tar
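One caveat with the xargs form: for a very long list, xargs may run tar more than once, and each tar c run recreates the archive. A possible way around that, which also copes with spaces in file names, is to let tar read a NUL-delimited list from standard input. This is a sketch assuming GNU find and GNU tar; the -path patterns are my approximation of the grep filters above (slightly stricter for bin/custom), and the function name is invented.

```shell
# Hypothetical helper: archive bin/custom/** and config/local* (regular
# files only, so no symlinks), skipping the unwanted hidden files.
# Assumes GNU find and GNU tar (--null, -T -). Run from the project root.
make_custom_tar() {
  find . -type f \
       \( -path './bin/custom/*' -o -path './config/local*' \) \
       ! -path '*.settings*' ! -path '*.classpath*' ! -path '*.external*' \
       -print0 |
    tar --null -T - -cf "$1"
}

# Example call:
# make_custom_tar tarfile.tar
```

Because tar is started exactly once, the archive always contains the complete list, and the relative paths keep the directory hierarchy.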

Using find recursively and only returning the matching file names

I am trying to find all instances of a string within files - I am using find and it works great, however, it returns not only the file but every instance of that string within the file which results in a huge long list whereas I really only want the file name.
I am using:
find . -name '*.php' -exec grep -i 'MATCH' {} \; -print
This will show me every instance of MATCH and then the file name then the next batch and the filename so something like:
MATCH
MATCH
MATCH
./filename
MATCH
MATCH
MATCH
./filename2
I tried changing GREP to:
find . -name '*.php' -exec grep -H -i 'MATCH' {} \; -print
and this then gave me:
./filename: MATCH
./filename: MATCH
./filename: MATCH
./filename2: MATCH
./filename2: MATCH
./filename2: MATCH
however this still results in the same number of lines being shown, albeit slightly differently laid out.
I tried changing GREP to:
find . -name '*.php' -exec grep -l -i 'MATCH' {} \; -print
and this then gave me:
./filename
./filename
./filename
./filename2
./filename2
./filename2
Ideally I would like something like:
./filename
./filename2
which lists each matching file only once, regardless of how many times the string appears in it. Can this be done?
Use the features provided by grep:
-l, --files-with-matches print only names of FILEs containing matches
-R, -r, --recursive equivalent to --directories=recurse
--include=FILE_PATTERN search only files that match FILE_PATTERN
-i, --ignore-case ignore case distinctions
I.e.
grep -ril --include="*.php" 'MATCH' .
It's not the grep that's the problem, you're telling find to print every file name:
find . -name '*.php' -exec grep -l -i 'MATCH' {} \; -print
Note the -print at the end. Just remove that:
find . -name '*.php' -exec grep -l -i 'MATCH' {} \;
Look:
$ echo 'foo' > file1
$ echo 'foo' > file2
$ find . -name 'file*' -exec grep -l -i foo {} \; -print
./file1
./file1
./file2
./file2
$ find . -name 'file*' -exec grep -l -i foo {} \;
./file1
./file2
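A small variation, in case it is useful: ending -exec with + instead of \; passes many file names to each grep call, which is usually faster on large trees, and grep -l still prints each matching file only once. A sketch with a made-up function name:

```shell
# Hypothetical helper: print each *.php file under DIR that contains
# PATTERN (case-insensitive), one name per line, each name only once.
list_php_matching() {
  # "{} +" batches file names into as few grep invocations as possible.
  find "$1" -type f -name '*.php' -exec grep -l -i -e "$2" {} +
}

# Example call:
# list_php_matching . 'MATCH'
```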

How to search for 2 key words from files in a directory and print their filename if it occurs more than once

I am trying to grep or find 2 specific words in each file in a directory. Only if more than one file contains that combination should I print those file names to a CSV file.
Here is what I tried so far:
find /dir/test -type f -printf "%f\n" | xargs grep -r -l -e 'ABCD1' -e 'ABCD2' > log1.csv
But this will provide all file names that have "ABCD1" and "ABCD2". In other words, this command will print the filename even if there is only one file that has this combo.
I will need to grep the entire directory for those 2 words and both words MUST be in more than one file if it has to write the filenames to CSV. I should also be able to include sub directories
Any help would be great!
Thanks
find + GNU grep solution:
find . -type f -exec grep -qPz 'ABCD1[\s\S]*ABCD2|ABCD2[\s\S]*ABCD1' {} \; -printf "%f\n" \
| tee /tmp/flist | [[ $(wc -l) -gt 1 ]] && cat /tmp/flist > log1.csv
Alternative way:
grep -lr 'ABCD2' /dir/test/* | xargs grep -l 'ABCD1' | tee /tmp/flist \
| [[ $(wc -l) -gt 1 ]] && sed 's/.*\/\([^\/]*\)$/\1/' /tmp/flist > log1.csv
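The same idea can be written without a fixed /tmp/flist path. This is only a sketch: it assumes GNU grep and xargs (-Z, -0, -r), file names without newlines, and that "both words" means both appear somewhere in the same file. The function name is invented.

```shell
# Hypothetical helper: write the names of files under DIR that contain
# both ABCD1 and ABCD2 to OUT, but only if more than one file qualifies.
write_if_multiple() {
  dir=$1 out=$2
  list=$(mktemp) || return 1
  # The first grep lists files containing ABCD1 (NUL-terminated via -Z);
  # the second keeps only those that also contain ABCD2.
  grep -rlZ -e 'ABCD1' "$dir" | xargs -0 -r grep -l -e 'ABCD2' > "$list"
  if [ "$(wc -l < "$list")" -gt 1 ]; then
    # Strip the directory part, as %f does in the find answer above.
    sed 's|.*/||' "$list" > "$out"
  fi
  rm -f "$list"
}

# Example call:
# write_if_multiple /dir/test log1.csv
```

If fewer than two files qualify, OUT is simply not written.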

find + grep to match exact keywords in a file

My script is not matching exact words only. Example: 12312312Alachua21321 or Alachuas would match for Alachua.
KEYWORDS=("Alachua" "Gainesville" "Hawthorne")
IFS=$'\n'
find . -size +1c -type f ! -exec grep -qF "${KEYWORDS[*]}" {} \; -exec truncate -s 0 {} \;
If you want grep to match exact words, use grep -w.
You may also want to read the grep manual by running man grep.
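As an illustration of -w combined with the fixed-string keyword list from the question, here is a sketch that only reports whether a file contains one of the keywords as a whole word (it deliberately does not truncate anything). The function name is hypothetical.

```shell
# Hypothetical helper: succeed if FILE contains any keyword as a whole word.
# -F treats each pattern as a fixed string; -w requires the match to be
# delimited by non-word characters (or the start/end of the line).
has_exact_keyword() {
  # A newline-separated pattern is treated as a list of patterns.
  keywords=$(printf '%s\n' Alachua Gainesville Hawthorne)
  grep -qwF -e "$keywords" -- "$1"
}

# Example call:
# has_exact_keyword somefile && echo "keyword found"
```

With this check, 12312312Alachua21321 and Alachuas are not matches, because digits and letters count as word characters.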

Resources