Find number of strings in number of files - grep

#!/bin/ksh
inputpath="/home/beaadmin/SET4/Input.txt" #Give patterns to be searched
contentpath="/home/beaadmin/SET4/FILES"
outpath="/home/beaadmin/SET4/impacted"
count=1
while read line
do
echo "Line :$count"
echo "$line"
return=$(find $contentpath -iname "*" | xargs grep "$line*")
if [ $? -eq 0 ]
then
echo "$line" >> $outpath
else
echo ""
fi
let count=$count+1
done < $inputpath
Let's say I have string1, string2, string3 and File1, File2, File3.
I want to search for string1 in File1, File2, and File3 and, if a match is found, write it to the output path, then do the same for string2 and string3. But the code above is not finding the matches.

Assuming string1, string2, etc. are on separate lines of Input.txt, there's only a small bug.
Change this:
xargs grep "$line*"
To this:
xargs grep "$line.*"
Also, two suggestions about your find command. First, -iname "*" doesn't have any effect. Second, your find command will also return directories, which will cause grep to report some (probably non-fatal) errors. You could fix that with, e.g.,
find $contentpath -type f
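As a rough sketch of the whole loop with those suggestions applied, the variant below tests grep's exit status directly instead of going through find and xargs. The temporary directory and its contents are hypothetical stand-ins for the real paths, and -F (treat each line as a fixed string) is an assumption about what "strings" means here; -r is supported by GNU and BSD grep but is not in POSIX.

```shell
#!/bin/sh
# Sketch only: the demo directory and its contents are hypothetical stand-ins
# for /home/beaadmin/SET4.
workdir=$(mktemp -d)
inputpath="$workdir/Input.txt"      # one search string per line
contentpath="$workdir/FILES"        # files to search
outpath="$workdir/impacted"         # strings that matched somewhere

mkdir -p "$contentpath"
printf 'string1\nstring2\nstring3\n' > "$inputpath"
printf 'this file mentions string1\n' > "$contentpath/File1"
printf 'this one mentions string3\n'  > "$contentpath/File2"
: > "$outpath"

while IFS= read -r line
do
    # -r recurses into the directory, -q prints nothing (exit status only),
    # -F treats the line as a fixed string (an assumption; drop it for regexes).
    if grep -rqF -- "$line" "$contentpath"
    then
        echo "$line" >> "$outpath"
    fi
done < "$inputpath"

cat "$outpath"
```

Testing grep in the if condition avoids relying on $? after a pipeline, where the status reported is that of xargs rather than of any single grep call.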

Related

show filename with matching word from grep only

I am trying to find which codes occur in log files and show the log file name for anything that matches the following pattern:
'BA10\|BA20\|BA21\|BA30\|BA31\|BA00'
So if the file dummylogfile.log contains BA10002, I would like to get a result such as:
dummylogfile.log:BA10002
It is totally fine if the log file shows up twice for duplicate matches.
The closest I got is:
for f in $(find . -name '*.err' -exec grep -l 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' {} \+);do printf $f;printf ':';grep -o 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' $f;done
but this gives things like:
./register-05-14-11-53-59_24154.err:BA10
BA10
./register_mdw_files_2020-05-14-11-54-32_24429.err:BA10
BA10
./process_tables.2020-05-18-11-18-09_11428.err:BA30
./status_load_2020-05-18-11-35-31_9185.err:BA30
So:
1) the second match in a file appears on a line of its own, without the file name, and
2) the full match (e.g., BA10004) is not shown.
Thanks for the help.
There are a couple of options you can pass to grep:
-H: This will report the filename and the match
-o: only show the match, not the full line
-w: The match must form a whole word (a string built from [A-Za-z0-9_])
Take a pattern such as BA01: it matches only the four characters BA01, which can appear anywhere in the text, including mid-word. If you want the regex to match a full word, it should read BA01[[:alnum:]_]*, which appends any sequence of word-constituent characters ([[:alnum:]_] is equivalent to [A-Za-z0-9_]). You can test this with
$ echo "foo BA01234 barBA012" | grep -Ho "BA01"
(standard input):BA01
(standard input):BA01
$ echo "foo BA01234 barBA012" | grep -How "BA01"
$ echo "foo BA01234 barBA012" | grep -How "BA01[[:alnum:]_]*"
(standard input):BA01234
So your grep should look like
grep -How "\(BA10\|BA20\|BA21\|BA30\|BA31\|BA00\)[[:alnum:]_]*" *.err
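The command above can be tried on a throwaway file. This is a minimal sketch, assuming GNU grep (the -H, -o and -w options are not in older POSIX); the file name and its contents are invented, and -E is used so the alternation can be written without backslashes.

```shell
#!/bin/sh
# Sketch only: the file name and contents below are made up for the demo.
logdir=$(mktemp -d)
printf 'load failed with BA10002\ncodes BA30 and xBA20\n' > "$logdir/dummylogfile.err"

# -H: print the file name, -o: print only the matched text,
# -w: require the match to be a whole word, -E: extended regex, so the
# alternation needs no backslashes.
matches=$(cd "$logdir" && grep -HowE '(BA10|BA20|BA21|BA30|BA31|BA00)[[:alnum:]_]*' *.err)
printf '%s\n' "$matches"
```

Each match is printed on its own line with the file name in front. The xBA20 in the sample data is not reported, because -w rejects a match that starts in the middle of a word.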
From your example it seems that all files are in one directory. So the following works right away:
grep -l 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' *.err
If the files are in different directories:
find . -name '*.err' -print | xargs -I {} grep 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' {} /dev/null
Explanation: passing /dev/null as a second file argument after {} forces grep to report the matching file name, because grep prints file names whenever it is given more than one file.
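The effect of the extra /dev/null argument is easy to see in isolation. A minimal sketch, with a hypothetical directory layout and variable names:

```shell
#!/bin/sh
# Sketch only: the directory layout is hypothetical.
top=$(mktemp -d)
mkdir -p "$top/sub dir"
printf 'status BA10002\n' > "$top/sub dir/one.err"

# One file argument: grep prints only the matching line.
plain=$(grep 'BA10' "$top/sub dir/one.err")

# Two file arguments (the second is the always-empty /dev/null):
# grep now prefixes each match with the file name.
named=$(grep 'BA10' "$top/sub dir/one.err" /dev/null)

printf '%s\n%s\n' "$plain" "$named"
```

Where grep supports -H, that option has the same effect without the extra argument.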

grep multiple files through tail

I am trying to grep a phrase out of multiple log files as they are being written, with an indication of which file was updated with the phrase.
For example:
grep bindaddr /vservers/*/var/log
gets me:
/vservers/11010/var/log:bindaddr=xxx.xxx.xxx.xxx
/vservers/12525/var/log:bindaddr=xxx.xxx.xxx.xxx
/vservers/12593/var/log:bindaddr=xxx.xxx.xxx.xxx
Which is cool, but I need this for tail -f.
tail -fn 100 /vservers/*/var/log | grep bindaddr
gets me the lines needed but no indicator in which file, so I need a mix of the two.
If you use -v in tail, you get a verbose mode; from man tail: "always output headers giving file names". This way, whenever something happens in a file, you will get the header on the preceding line.
Together with this, you can use grep -B1 to show the match + the previous line.
All together, this should do:
tail -fvn 100 /vservers/*/var/log | grep -B1 bindaddr
Test
Doing this in one tab:
$ echo "hi" >> a2
$ echo "hi" >> a2
$ echo "hi" >> a1
$ echo "hi" >> a2
I got this in the other one:
$ tail -vfn 100 /tmp/a* | grep -B1 "h"
==> /tmp/a1 <==
==> /tmp/a2 <==
hi
hi
==> /tmp/a1 <==
hi
==> /tmp/a2 <==
hi
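If the goal is the same file:line layout that plain grep produces, one possible approach is to let awk remember the most recent header and prepend it to matching lines. This is only a sketch: prefix_matches is a made-up helper name, and the demo uses tail without -f so that it terminates.

```shell
#!/bin/sh
# prefix_matches PATTERN: read tail's multi-file output on stdin and print
# "file:line" for every line matching PATTERN (an awk regular expression).
# The function name is made up for this sketch.
prefix_matches() {
    awk -v pat="$1" '
        # Header lines look like "==> name <=="; remember the name.
        /^==> .* <==$/ { file = substr($0, 5, length($0) - 8); next }
        $0 ~ pat       { print file ":" $0 }
    '
}

# Demo on two throwaway files; tail prints headers whenever it is given more
# than one file, and -v forces them even for a single file.
demo=$(mktemp -d)
printf 'bindaddr=10.0.0.1\nother line\n' > "$demo/a1"
printf 'nothing here\n' > "$demo/a2"
result=$(tail -n 100 "$demo/a1" "$demo/a2" | prefix_matches 'bindaddr')
printf '%s\n' "$result"
```

For live logs the same filter would sit behind the command from the answer, e.g. tail -fvn 100 /vservers/*/var/log | prefix_matches bindaddr.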
Something like this to put the filename in the front of each line from tail:
#!/bin/bash
# Arrange to kill all descendants on exit/interrupt
trap "kill 0" SIGINT SIGTERM EXIT
for f in *.txt; do
tail -f "$f" | sed "s|^|$f: |" > /dev/tty &
done
# grep in stdin (i.e. /dev/tty)
grep bina -
I think some people are coming to this post looking for a way to display the filename while grepping the tail of multiple files:
for f in path/to/files*.txt; do echo "$f"; tail "$f" | grep 'SEARCH-THIS'; done
This produces output like this:
filename1.txt
search result 1
search result 2
filename2.txt
search result 3
search result 4
...

Working on a script to recursively search folders for strings within files- but running into a white space issue

I was wondering what the options are for handling folders with white space in their names, since Unix tools don't cope well with them by default. I checked around for solutions and people pointed to -print0, but it seems to be exclusive to the find command. Is there something like that for grep?
FOLDER=$1
STRING=$2
grep -lr $STRING $FOLDER | while read file; do
echo "Found String at " $file
echo "Lines-"
grep -n $STRING $file
echo
done
Your script should work OK, as long as no file names contain newlines, once the variables are double-quoted (the same applies to $STRING and $FOLDER in the first grep). The second grep becomes:
grep -n "$STRING" "$file"
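Putting the quoting together, a fully quoted version of the script might look like the sketch below. The function name search_folder and the demo paths are hypothetical; like the original, it still assumes no file name contains a newline.

```shell
#!/bin/sh
# search_folder FOLDER STRING: list each file under FOLDER containing STRING,
# followed by the numbered matching lines. Hypothetical helper name.
search_folder() {
    # Quoting "$1" and "$2" keeps a path with spaces as one argument;
    # IFS= and -r stop read from trimming blanks or eating backslashes.
    grep -lr -- "$2" "$1" | while IFS= read -r file; do
        echo "Found String at $file"
        echo "Lines-"
        grep -n -- "$2" "$file"
        echo
    done
}

# Demo with spaces in both the directory and the file name.
base=$(mktemp -d)
mkdir -p "$base/my folder"
printf 'first\nneedle here\n' > "$base/my folder/file one.txt"
report=$(search_folder "$base/my folder" 'needle')
printf '%s\n' "$report"
```

As for a grep counterpart to -print0: GNU grep has -Z (--null), which writes a zero byte after each file name so that the list can be passed to xargs -0.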

How can I have grep not print out 'No such file or directory' errors?

I'm grepping through a large pile of code managed by git, and whenever I do a grep, I see piles and piles of messages of the form:
> grep pattern * -R -n
whatever/.git/svn: No such file or directory
Is there any way I can make those lines go away?
You can use the -s or --no-messages flag to suppress errors.
-s, --no-messages suppress error messages
grep pattern * -s -R -n
If you are grepping through a git repository, I'd recommend you use git grep. You don't need to pass in -R or the path.
git grep pattern
That will show all matches from your current directory down.
Errors like that are usually sent to the "standard error" stream, which you can pipe to a file or just make disappear on most commands:
grep pattern * -R -n 2>/dev/null
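Both approaches can be compared on a small test directory. The sketch below builds a dangling symlink to provoke the error; all names are invented, and the || true guards are only there because grep exits non-zero when it cannot read a file.

```shell
#!/bin/sh
# Sketch only: builds a throwaway directory containing a dangling symlink.
repo=$(mktemp -d)
printf 'pattern lives here\n' > "$repo/real.txt"
ln -s "$repo/does-not-exist" "$repo/broken"

cd "$repo" || exit 1

# Plain grep: the dangling link produces a message on stderr.
# (stderr is captured, stdout is discarded.)
noisy=$(grep -n pattern * 2>&1 >/dev/null || true)

# -s drops the message; the matches themselves are unaffected.
quiet_s=$(grep -sn pattern * 2>&1 || true)

# Redirecting stderr has the same visible effect.
quiet_redirect=$(grep -n pattern * 2>/dev/null || true)

printf 'stderr without -s: %s\n' "$noisy"
printf 'with -s:           %s\n' "$quiet_s"
printf 'with 2>/dev/null:  %s\n' "$quiet_redirect"
```

Neither option changes grep's exit status, so a script that checks $? will still see that something could not be read.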
I have seen that happen several times with broken links (symlinks that point to files that do not exist): grep tries to search the target file, which does not exist, hence the correct and accurate error message.
I normally don't bother while doing sysadmin tasks over the console, but from within scripts I do look for text files with "find", and then grep each one:
find /etc -type f -exec grep -nHi -e "widehat" {} \;
Instead of:
grep -nRHi -e "widehat" /etc
I usually don't let grep do the recursion itself. There are usually a few directories you want to skip (.git, .svn...)
You can build handy aliases from commands like this one:
find . \( -name .svn -o -name .git \) -prune -o -type f -exec grep -Hn pattern {} \;
It may seem overkill at first glance, but when you need to filter out some patterns it is quite handy.
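Here is a small sketch of that prune pattern on a throwaway tree (directory and file names are invented). It uses {} + rather than \; so that find passes many files to each grep call.

```shell
#!/bin/sh
# Sketch only: a throwaway tree with a .git directory that should be skipped.
tree=$(mktemp -d)
mkdir -p "$tree/.git" "$tree/src"
printf 'pattern in metadata\n' > "$tree/.git/config"
printf 'int x;\npattern in source\n' > "$tree/src/a.c"

# -prune stops find from descending into .svn/.git; everything else that is
# a regular file is handed to grep. "{} +" batches many files per grep call
# (the original uses "\;", one grep per file).
hits=$(cd "$tree" && find . \( -name .svn -o -name .git \) -prune -o -type f -exec grep -Hn pattern {} +)
printf '%s\n' "$hits"
```

Only the match under src is reported; the copy of the pattern inside .git is never read.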
Have you tried the -0 option in xargs? It expects NUL-separated input, so pair it with find -print0. Something like this:
find . -type f -print0 | xargs -0 grep 'some text'
Use -I in grep to skip binary files, together with -s to suppress error messages.
Example: grep SEARCH_ME -Irs ~/logs
I redirect stderr to stdout and then use grep's invert-match (-v) to exclude the warning/error string that I want to hide:
grep -r <pattern> * 2>&1 | grep -v "No such file or directory"
I was getting lots of these errors running "M-x rgrep" from Emacs on Windows with /Git/usr/bin in my PATH. Apparently in that case, M-x rgrep uses "NUL" (the Windows null device) rather than "/dev/null". I fixed the issue by adding this to .emacs:
;; Prevent issues with the Windows null device (NUL)
;; when using cygwin find with rgrep.
(defadvice grep-compute-defaults (around grep-compute-defaults-advice-null-device)
"Use cygwin's /dev/null as the null-device."
(let ((null-device "/dev/null"))
ad-do-it))
(ad-activate 'grep-compute-defaults)
One easy way to make grep return zero status all the time is to use || true
→ echo "Hello" | grep "This won't be found" || true
→ echo $?
0
As you can see, the exit status here is 0 (success).
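One thing to keep in mind with || true is that it also hides real failures, since grep uses 1 for "no match" and a value greater than 1 for errors. A hedged sketch of telling the cases apart follows; classify is a hypothetical helper name.

```shell
#!/bin/sh
# classify FILE PATTERN: report what grep's exit status means.
# Hypothetical helper; -q and -s keep grep itself silent.
classify() {
    status=0
    grep -qs -- "$2" "$1" || status=$?
    case $status in
        0) echo "match" ;;
        1) echo "no match" ;;
        *) echo "error" ;;
    esac
}

sample=$(mktemp)
echo "Hello" > "$sample"
classify "$sample" "Hello"                 # a line matches
classify "$sample" "This won't be found"   # nothing matches
classify "$sample.missing" "Hello"         # file cannot be opened
```

Capturing the status with || status=$? keeps the script running even under set -e, while still leaving the three outcomes distinguishable.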

Automatically ignore files in grep

Is there any way I could use grep to ignore some files when searching something, something equivalent to svnignore or gitignore? I usually use something like this when searching source code.
grep -r something * | grep -v ignore_file1 | grep -v ignore_file2
Even if I could set up an alias to grep to ignore these files would be good.
The --exclude option of grep will also work:
grep perl * --exclude='try*' --exclude='tk*'
This searches for perl in files in the current directory excluding files beginning with try or tk.
You might also want to take a look at ack which, among many other features, by default does not search VCS directories like .svn and .git.
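For recursive searches, --exclude can be combined with --exclude-dir to approximate an ignore list. The sketch below assumes GNU grep (BSD grep accepts the same options); the file names are invented to mirror the try*/tk* example.

```shell
#!/bin/sh
# Sketch only: file names are invented to mirror the try*/tk* example.
proj=$(mktemp -d)
mkdir -p "$proj/.git"
printf 'perl is used here\n' > "$proj/keep.txt"
printf 'perl scratch work\n' > "$proj/try1.txt"
printf 'perl in tk code\n'   > "$proj/tkdemo.txt"
printf 'perl in metadata\n'  > "$proj/.git/notes"

# Quote the globs so the shell passes them to grep untouched.
# --exclude filters file names; --exclude-dir skips whole directories.
found=$(cd "$proj" && grep -r --exclude='try*' --exclude='tk*' --exclude-dir=.git perl .)
printf '%s\n' "$found"
```

Wrapping such a command line in a shell alias or function gives roughly the "set it once" behaviour the question asks for.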
find . -path ./ignore -prune -o -type f -exec grep -H something {} +
This finds all regular files under your current directory, skipping the directory (or file) named "ignore", and runs grep on the files that remain. The -type f test matters: without it, find would also hand the directory "." itself to grep, and a recursive grep on "." would search the ignored directory after all.
Use shell expansion
shopt -s extglob
for file in !(file1_ignore|file2_ignore)
do
grep ..... "$file"
done
I think grep does not have filename filtering.
To accomplish what you are trying to do, you can combine find, xargs, and grep commands.
My memory is not good, so the example might not work:
find -name "foo" | xargs grep "pattern"
Find is flexible, you can use wildcards, ignore case, or use regular expressions.
You may want to read manual pages for full description.
Update: after reading the next post, apparently grep does have filename filtering.
Here's a minimalistic version of .gitignore. Requires standard utils: awk, sed (because my awk is so lame), egrep:
cat > ~/bin/grepignore #or anywhere you like in your $PATH
egrep -v "`awk '1' ORS=\| .grepignore | sed -e 's/|$//g' ; echo`"
^D
chmod 755 ~/bin/grepignore
cat >> ./.grepignore #above set to look in cwd
ignorefile_1
...
^D
grep -r something * | grepignore
grepignore builds a simple alternation clause:
egrep -v "ignorefile_one|ignorefile_two"
Not incredibly efficient, but good for manual use.
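A possibly simpler variant of the same idea swaps the awk/sed/egrep pipeline for grep's own -f option, which reads one pattern per line from a file. This is a sketch under that substitution, not the method from the answer above; the file names are invented.

```shell
#!/bin/sh
# Sketch only: swaps the awk/sed/egrep pipeline for grep's own -f option,
# which reads one pattern per line from a file. Names are invented.
cwd=$(mktemp -d)
cd "$cwd" || exit 1
printf 'something useful\n'  > wanted.txt
printf 'something noisy\n'   > ignorefile_1
printf 'something else\n'    > ignorefile_2
printf 'ignorefile_1\nignorefile_2\n' > .grepignore

# -v inverts the match, -F takes each line of .grepignore literally,
# -f names the pattern file.
kept=$(grep -r something . | grep -vFf .grepignore)
printf '%s\n' "$kept"
```

Like the original, this filters whole output lines, so a matching line whose text happens to contain one of the ignored names is dropped as well.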
