My script is not matching exact words only. Example: 12312312Alachua21321 or Alachuas would match for Alachua.
KEYWORDS=("Alachua" "Gainesville" "Hawthorne")
IFS=$'\n'
find . -size +1c -type f ! -exec grep -qF "${KEYWORDS[*]}" {} \; -exec truncate -s 0 {} \;
If you want grep to match exact words, use grep -w.
You may also want to read the grep manual by running man grep.
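As a rough illustration of what -w changes, the sketch below feeds the sample strings from the question through grep -qwF. The helper name has_keyword and the third sample string are made up for this example; the keywords are joined with newlines so that grep treats each one as a separate fixed-string pattern.

```shell
#!/usr/bin/env bash
# Sketch: whole-word matching against several fixed strings at once.
KEYWORDS=("Alachua" "Gainesville" "Hawthorne")

# Join the keywords with newlines; grep reads each line as its own pattern.
patterns=$(printf '%s\n' "${KEYWORDS[@]}")

# has_keyword (hypothetical helper): succeeds if stdin contains
# one of the keywords as a whole word.
has_keyword() {
    grep -qwF -e "$patterns"
}

for sample in "12312312Alachua21321" "Alachuas" "county of Alachua, FL"; do
    if printf '%s\n' "$sample" | has_keyword; then
        echo "match:    $sample"
    else
        echo "no match: $sample"
    fi
done
```

In the original find command this would amount to replacing grep -qF with grep -qwF.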
I want to search a directory (excluding paths that contain any of certain words, ideally a regex pattern) and find all files with contents that match my query (ideally a regex pattern, which I'd make case-insensitive) and were modified between 2 specific dates.
Based on this answer, my current command is:
find /mnt/c/code -type f -mtime -100 -mtime +5 -print0 |
xargs -0 grep -l -v "firstUnwantedTerm" 'mySearchTerm'
Apparently this query does not exclude all paths that contain "firstUnwantedTerm".
Also, I'd love if the results could be sorted by modified datetime descending, displaying: their modified time, the full file name, and the search query (maybe in a different color in the console) surrounded by some context where it was seen.
grep -rnwl --exclude='*firstUnwantedTerm*' '/mnt/c/code' -e "mySearchTerm" from here also seemed to be a step in the right direction in the sense that it seems to correctly exclude my exclusion term, but it doesn't filter by modified datetime and doesn't output all the desired fields, of course.
This is just quick & dirty and without sorting by date, but with 3 lines of context before/after each match and coloured matches:
find ~/mnt/c/code -type f -mtime -100 -mtime +5 | grep -v 'someUnwantedPath' | xargs -I '{}' sh -c "ls -l '{}' && grep --color -C 3 -h 'mySearchTerm' '{}'"
Broken down into pieces with some explanation:
# Find regular files between 100 and 5 days old (modification time)
find ~/mnt/c/code -type f -mtime -100 -mtime +5 |
# Remove unwanted files from list
grep -v 'someUnwantedPath' |
# List each file, then find search term in each file,
# highlighting matches and
# showing 3 lines of context above and below each match
xargs -I '{}' sh -c "ls -l '{}' && grep --color -C 3 -h 'mySearchTerm' '{}'"
I think you can take it from here. Of course this can be made more beautiful and fulfill all your requirements, but I just had a couple of minutes and leave it to the UNIX gurus to beat me and make this whole thing 200% better.
Update: version 2 without xargs and with only one grep command:
find ~/mnt/c/code -type f -mtime -30 -mtime +25 ! -path '*someUnwantedPath*' -exec stat -c "%y %s %n" {} \; -exec grep --color -C 3 -h 'mySearchTerm' {} \;
! -path '*someUnwantedPath*' filters out unwanted paths, and the two -exec subcommands list candidate files and then show the grep results (which could also be empty), just like before. Please note that I changed from using ls -l to stat -c "%y %s %n" in order to list file date, size and name (just modify as you wish).
Again, with additional line breaks for readability:
find ~/mnt/c/code \
  -type f \
  -mtime -30 -mtime +25 \
  ! -path '*someUnwantedPath*' \
  -exec stat -c "%y %s %n" {} \; \
  -exec grep --color -C 3 -h 'mySearchTerm' {} \;
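Neither version sorts by date. One possible way to get newest-first output, sketched here under the assumption that GNU find, sort, cut, xargs, stat and grep are available, is to have find emit a timestamp for each matching file, sort on it, and only then print the details. The function name search_sorted is made up for this example, and someUnwantedPath is the same placeholder as above.

```shell
#!/usr/bin/env bash
# Sketch: show matching files newest first, each followed by its matches.
# Assumes GNU tools (find -printf, sort -z, cut -z, stat -c).
search_sorted() {
    local dir=$1 term=$2
    # 1. Keep files 5 to 100 days old, outside the unwanted path, that contain the term.
    # 2. Print "<mtime in seconds> <path>" records, NUL-terminated.
    # 3. Sort numerically, newest first, then drop the timestamp column.
    # 4. For each file: print date and name, then matches with 3 lines of context.
    find "$dir" -type f -mtime -100 -mtime +5 ! -path '*someUnwantedPath*' \
        -exec grep -qi -e "$term" {} \; -printf '%T@ %p\0' |
        sort -z -rn |
        cut -z -d ' ' -f 2- |
        xargs -0 -I '{}' sh -c \
            'stat -c "%y %n" "$1"; grep --color=auto -i -C 3 -e "$2" "$1"' sh '{}' "$term"
}
```

It could be called as search_sorted /mnt/c/code 'mySearchTerm'. Passing the file name and the term to sh as positional parameters, rather than pasting them into the command string, keeps file names with quotes or spaces from breaking the inner command.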
I am trying to find all files that contain a string. I am using find and it works, but it returns not only the file name but every instance of the string within the file. That produces a very long list, whereas I really only want the file names.
I am using:
find . -name '*.php' -exec grep -i 'MATCH' {} \; -print
This will show me every instance of MATCH and then the file name then the next batch and the filename so something like:
MATCH
MATCH
MATCH
./filename
MATCH
MATCH
MATCH
./filename2
I tried changing GREP to:
find . -name '*.php' -exec grep -H -i 'MATCH' {} \; -print
and this then gave me:
./filename: MATCH
./filename: MATCH
./filename: MATCH
./filename2: MATCH
./filename2: MATCH
./filename2: MATCH
However, this still results in the same number of lines being shown, albeit laid out slightly differently.
I tried changing GREP to:
find . -name '*.php' -exec grep -l -i 'MATCH' {} \; -print
and this then gave me:
./filename
./filename
./filename
./filename2
./filename2
./filename2
Ideally I would like something like:
./filename
./filename2
which lists each matching file only once, regardless of how many times the string appears in it. Can this be done?
Use the features provided by grep:
-l, --files-with-matches   print only names of FILEs containing matches
-R, -r, --recursive        equivalent to --directories=recurse
--include=FILE_PATTERN     search only files that match FILE_PATTERN
-i, --ignore-case          ignore case distinctions
I.e.
grep -ril --include="*.php" 'MATCH' .
It's not the grep that's the problem, you're telling find to print every file name:
find . -name '*.php' -exec grep -l -i 'MATCH' {} \; -print
Note the -print at the end. Just remove that:
find . -name '*.php' -exec grep -l -i 'MATCH' {} \;
Look:
$ echo 'foo' > file1
$ echo 'foo' > file2
$ find . -name 'file*' -exec grep -l -i foo {} \; -print
./file1
./file1
./file2
./file2
$ find . -name 'file*' -exec grep -l -i foo {} \;
./file1
./file2
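A related variant, sketched here as an alternative rather than as what the answer proposes, keeps -print and silences grep instead. With -q, grep prints nothing and only reports success or failure, so -exec acts as a test and find prints each matching name once. The function name list_matching is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: use grep -q purely as a yes/no test and let find print the name.
list_matching() {
    local dir=$1 pattern=$2
    # -exec ... \; is true only when grep exits 0, so -print runs
    # once per file that contains the pattern.
    find "$dir" -name '*.php' -exec grep -qi -e "$pattern" {} \; -print
}
```

For example, list_matching . 'MATCH' prints one line per matching .php file under the current directory.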
I want to grep -R a directory but exclude symlinks. How do I do it?
Maybe something like grep -R --no-symlinks or something?
Thank you.
GNU grep 2.11-8 and later, when invoked with -r, skips symlinks that are not specified on the command line; when invoked with -R, it follows them.
If you already know the name(s) of the symlinks you want to exclude:
grep -r --exclude-dir=LINK1 --exclude-dir=LINK2 PATTERN .
If the name(s) of the symlinks vary, maybe exclude symlinks with a find command first, and then grep the files that this outputs:
find . -type f -a -exec grep -H PATTERN '{}' \;
The '-H' to grep adds the filename to the output (which is the default if grep is searching recursively, but is not here, where grep is being handed individual file names.)
I commonly want to modify grep to exclude source control directories. That is most efficiently done by the initial find command:
find . -name .git -prune -o -type f -a -exec grep -H PATTERN '{}' \;
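The behaviour of the find approach can be checked in a throwaway directory. The sketch below uses made-up file names: it creates a regular file, a symlink to it, and a .git directory, then runs the pruning command from above. Only the regular file outside .git should be reported.

```shell
#!/usr/bin/env bash
# Sketch: confirm that find -type f skips symlinks and that -prune skips .git.
# All file names are invented for the demonstration.
demo=$(mktemp -d)
mkdir -p "$demo/.git"
printf 'PATTERN in a regular file\n' > "$demo/real.txt"
printf 'PATTERN inside .git\n' > "$demo/.git/config"
ln -s real.txt "$demo/link.txt"   # symlink pointing at the regular file

# Regular files only; .git is pruned before find descends into it.
found=$(find "$demo" -name .git -prune -o -type f -exec grep -H PATTERN '{}' \;)
printf '%s\n' "$found"

rm -rf "$demo"
```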
For now, here is how I would exclude symbolic links when using grep.
If you want just file names matching your search:
for f in $(grep -Rl 'search' *); do if [ ! -h "$f" ]; then echo "$f"; fi; done;
Explanation:
grep -R # recursive
grep -l # file names only
if [ ! -h "file" ] # bash if not a symbolic link
If you want the matched content output, how about a double grep:
srch="whatever"; for f in $(grep -Rl "$srch" *); do if [ ! -h "$f" ]; then
echo -e "\n## $f";
grep -n "$srch" "$f";
fi; done;
Explanation:
echo -e # enable interpretation of backslash escapes
grep -n # adds line numbers to output
It's not perfect, of course, but it could get the job done!
If you're using an older grep that does not have the -r behavior described in Aryeh Leib Taurog's answer, you can use a combination of find, xargs and grep:
find . -type f | xargs grep "text-to-search-for"
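One caveat: a plain find | xargs pipeline splits file names on whitespace, so names containing spaces are passed to grep in pieces. A minimal sketch of the usual workaround, assuming find supports -print0 and xargs supports -0 (both GNU and BSD versions do), is shown below. The function name search_tree is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: NUL-separated file names survive spaces and other odd characters.
search_tree() {
    local dir=$1 text=$2
    # -type f keeps symlinks out; -H prints the file name even for a single file.
    find "$dir" -type f -print0 | xargs -0 grep -H -e "$text"
}
```

Usage would look like search_tree . "text-to-search-for".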
If you are using BSD grep (Mac), the following works similarly to the -r option of GNU grep:
grep -OR <PATTERN> <PATH> 2> /dev/null
From man page
-O If -R is specified, follow symbolic links only if they were explicitly listed on the command line.
So far I have this:
prompt$ find path/to/project -type f | grep -v '*.ori|*.pte|*.uh|*.mna' | xargs dos2unix 2> log.txt
However, the files with extensions .ori, .pte, .uh and .mna still show up.
It is better to leave the excluding to find, see Birei's answer.
The problem with your grep pattern is that you have specified it as a shell glob. By default grep expects basic regular expressions (BRE) as its first argument. So if you replace your grep pattern with: .*\.\(ori\|pte\|uh\|mna\)$ it should work. Or if you would rather use extended regular expressions (ERE), you can enable them with -E. Then you can express the same exclusion like this: .*\.(ori|pte|uh|mna)$.
Full command-line:
find . -type f | grep -vE '.*\.(ori|pte|uh|mna)$'
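The difference is easy to see on a fixed list of names. The sketch below uses invented paths and compares the original glob-style pattern with an anchored ERE. Read as a regular expression, the glob-style pattern requires a literal * in the line, so grep -v removes nothing.

```shell
#!/usr/bin/env bash
# Sketch: glob-style pattern versus a real regular expression with grep -v.
# The paths are made up for the demonstration.
paths=$(printf '%s\n' a/one.c a/two.ori b/three.pte b/four.uh b/five.mna b/mna.txt)

# Original pattern: as a BRE it matches none of these lines,
# so -v lets every path through.
glob_result=$(printf '%s\n' "$paths" | grep -v '*.ori|*.pte|*.uh|*.mna')

# ERE anchored at the end of the line: only the four extensions are dropped.
ere_result=$(printf '%s\n' "$paths" | grep -vE '\.(ori|pte|uh|mna)$')

echo "glob-style pattern keeps:"
printf '  %s\n' $glob_result
echo "anchored ERE keeps:"
printf '  %s\n' $ere_result
```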
One way:
find path/to/project -type f ! \( -name '*.ori' -o -name '*.pte' -o -name '*.uh' -o -name '*.mna' \) |
xargs dos2unix 2> log.txt
I'm trying to grep multiple extensions within the current and all sub-folders.
grep -i -r -n 'hello' somepath/*.{php,html}
This is only grepping the current folder but not sub-folders.
What would be a good way of doing this?
Using only grep:
grep -irn --include='*.php' --include='*.html' 'hello' somepath/
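If the list of extensions grows, bash brace expansion can generate the repeated --include options. This is only a sketch: it relies on bash (plain sh does not expand braces), and the function name grep_by_ext is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: build one --include option per extension with brace expansion.
grep_by_ext() {
    local dir=$1 pattern=$2
    # Expands to --include=*.php --include=*.html; the quotes keep the
    # shell from globbing the * before grep sees it.
    grep -irn --include='*.'{php,html} -e "$pattern" "$dir"
}
```

For example, grep_by_ext somepath/ 'hello' searches somepath/ and all of its sub-folders.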
One of these:
find . '(' -name '*.php' -o -name '*.html' ')' -exec grep -i -n hello {} +
find . '(' -name '*.php' -o -name '*.html' ')' -print0 | xargs -0 grep -i -n hello
I was looking for the same thing, and when I decided to write a bash script I started with vim codesearch and, surprise, I had already done this before!
#!/bin/bash
context="$3"
# sl = selected lines, mc = matches in context lines, ms = matches in selected lines, ln = line numbers
export GREP_COLORS="sl=32:mc=00;33:ms=05;40;31:ln="
if [[ "$context" == "" ]]; then context=5; fi
grep --color=always -n -a -R -i -C"$context" --exclude='*.mp*' \
    --exclude='*.avi' \
    --exclude='*.flv' \
    --exclude='*.png' \
    --exclude='*.gif' \
    --exclude='*.jpg' \
    --exclude='*.wav' \
    --exclude='*.rar' \
    --exclude='*.zip' \
    --exclude='*.gz' \
    --exclude='*.sql' "$2" "$1" | less -R
Paste this code into a file named codesearch and set its permissions to 700 or 770 with chmod.
I guess this is a better place to keep it for the next time I forget.
The script shows the matches in colour, with the context around them:
./codesearch '/full/path' 'string to search'
Optionally, pass the number of context lines as a third argument (the default is 5):
./codesearch '/full/path' 'string to search' 3
I edited the code and added some eye candy.
Example: ./codesearch ./ 'eval' 2
With "allow blinking text" enabled in the terminal, the matches blink.