Restrict grep command to print file name only once

I am interested in finding a string within a specific file type.
The command below serves my purpose.
find /any/path -type f -name "*.log" | xargs grep -B2 -A2 'SUMMARY'
It gives the following output:
--
/path/to/file.log-line1
/path/to/file.log-line2
/path/to/file.log:text SUMMARY text
/path/to/file.log-line1
/path/to/file.log-line2
--
I would like the file name not to be prepended to each line. Is it possible to have the output as below?
--
/path/to/file.log
line1
line2
text SUMMARY text
line1
line2
--

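If you're running this under Linux with bash, you could use a bash script like this:
#!/bin/bash
# List each matching .log file once, then print its matches with two lines of context.
grep --include='*.log' -lr 'SUMMARY' /any/path | while IFS= read -r fn; do
    echo "$fn"
    grep -A2 -B2 'SUMMARY' "$fn"
done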
The first grep recursively finds all files under "/any/path" that contain the word "SUMMARY". Each matching file's name is printed once, and the second grep then prints the matching lines with their context.
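A variation on the same idea, sketched under the assumption that grep supports -B and -A (GNU and BSD grep do): find hands the files to a small inline sh script, so file names containing spaces survive intact. The function name grep_with_header and its two parameters are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: print each matching file name once, then its matches with context.
# $1 = directory to search, $2 = pattern (both are illustrative parameters).
grep_with_header() {
    find "$1" -type f -name '*.log' -exec sh -c '
        pattern=$1; shift
        for f in "$@"; do
            # -q only tests for a match; the second grep prints the context lines
            if grep -q -- "$pattern" "$f"; then
                printf "%s\n" "$f"
                grep -B2 -A2 -- "$pattern" "$f"
            fi
        done
    ' sh "$2" {} +
}
```

Called as grep_with_header /any/path SUMMARY, it prints the path on its own line followed by the context block. Because the second grep receives a single file, it does not prefix the lines with the file name.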

Related

show filename with matching word from grep only

I am trying to find which words occur in my logfiles, and to show the logfile name for anything that matches the following pattern:
'BA10\|BA20\|BA21\|BA30\|BA31\|BA00'
so if file dummylogfile.log contains BA10002 I would like to get a result such as:
dummylogfile.log:BA10002
it is totally fine if the logfile shows up twice for duplicate matches.
the closest I got is:
for f in $(find . -name '*.err' -exec grep -l 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' {} \+);do printf $f;printf ':';grep -o 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' $f;done
but this gives things like:
./register-05-14-11-53-59_24154.err:BA10
BA10
./register_mdw_files_2020-05-14-11-54-32_24429.err:BA10
BA10
./process_tables.2020-05-18-11-18-09_11428.err:BA30
./status_load_2020-05-18-11-35-31_9185.err:BA30
so,
1) there are empty lines with only the second match and
2) the full match (e.g., BA10004) is not shown.
thanks for the help
There are a couple of options you can pass to grep:
-H: This will report the filename and the match
-o: only show the match, not the full line
-w: The match must be a full word (a string built from [A-Za-z0-9_])
A bare pattern such as BA01 matches only those four characters, and it matches them anywhere in the text, including mid-word. To capture the whole word instead, write BA01[[:alnum:]_]*, which appends any sequence of word-constituent characters (equivalent to [A-Za-z0-9_]), and combine it with -w. You can test this with
$ echo "foo BA01234 barBA012" | grep -Ho "BA01"
(standard input):BA01
(standard input):BA01
$ echo "foo BA01234 barBA012" | grep -How "BA01"
$ echo "foo BA01234 barBA012" | grep -How "BA01[[:alnum:]_]*"
(standard input):BA01234
So your grep should look like
grep -How '\(BA10\|BA20\|BA21\|BA30\|BA31\|BA00\)[[:alnum:]_]*' *.err
From your example it seems that all files are in one directory. So the following works right away:
grep -l 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' *.err
If the files are in different directories:
find . -name '*.err' -print | xargs -I {} grep 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' {} /dev/null
Explanation: when grep is given more than one file argument, it prefixes each match with the file name, so adding /dev/null next to {} forces grep to report the matching filename.
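The two answers can be combined. The following is a minimal sketch, assuming GNU grep (for -H, -o, -w and the \| alternation in a basic regex): find collects the .err files from any directory, and -H makes the /dev/null trick unnecessary. The function name find_codes and the demo files are made up for illustration.

```shell
# Hypothetical demo data: two .err files in different directories.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'start BA10002 end\n' > "$demo/a.err"
printf 'xBA20001 and BA30007\n' > "$demo/sub/b.err"

find_codes() {
    # -H always prints the file name, -o only the match, -w requires whole words
    find "$1" -name '*.err' -exec \
        grep -How '\(BA10\|BA20\|BA21\|BA30\|BA31\|BA00\)[[:alnum:]_]*' {} + | sort
}
find_codes "$demo"
```

On the demo data this reports BA10002 for a.err and BA30007 for sub/b.err. The string xBA20001 is skipped because -w rejects a match that starts in the middle of a word.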

How to show the full path for a file found using the find command and given to grep to match a string

I have a lot of config.php files in subdirectories below where I invoke the find command and I want to list the path names of only those for which a grep string match is found. Here is what I've tried so far.
find `pwd -P` -name config.php -print -exec grep 'assetBasePath.*cloudfront' {} \;
It partially does the job because find does list the paths for all the config.php files it found, and grep prints the matching line in the file if the pattern does indeed match one.
What I want to achieve is an output similar to the above, but where only the pathnames for the files that have a grep match are shown.
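Move -print to the right-hand end so that it is "gated" by the outcome of -exec. find evaluates -print only when grep exits successfully, that is, when the file contains a match: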
find `pwd -P` -name config.php -exec grep 'assetBasePath.*cloudfront' {} \; -print
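If the matching lines themselves are not wanted, one possible refinement is to add -q so that grep stays silent and only its exit status drives -print. This is a sketch, not the answerer's command; the function name list_matching_configs is hypothetical.

```shell
# Hypothetical helper: list only the config.php files whose content matches.
# $1 = root directory (pass an absolute path, e.g. "$(pwd -P)", to get full paths).
list_matching_configs() {
    # grep -q prints nothing and exits 0 on a match; -exec is then true and -print runs
    find "$1" -name config.php -exec grep -q 'assetBasePath.*cloudfront' {} \; -print
}
```

For example, list_matching_configs "$(pwd -P)" prints one absolute path per matching file and nothing for the others.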

grep: Find all files containing the word `star`, but not the word `start`

I have a bunch of files: some contain the word star, some contain the word start, some contain both.
I'd like to grep for files that contain the word star, but not the word start.
How can this be accomplished using only grep?
grep has some options for inverting the matches at the line or file level. You want the latter option, with the -L switch. The following will print the names of all the files in a folder that don't contain the text start:
grep -LF start *
-F tells grep that start is a literal string and not a regex. It's optional here, but might speed things up a tiny bit.
You can use the resulting list to search for files that contain star:
grep -lF star $(grep -LF start *)
-l prints only the names of files containing a match, not any line-by-line or match-by-match details. If this is not exactly what you want, man grep is your friend.
This uses an additional shell construct to run the inverted match, but it technically doesn't call any additional programs that aren't grep.
Update
Since you mention wanting to look through all the files starting with a given root folder, change -LF to -LFr. Replace * with your root folder if you don't want to change working directories.
-r tells grep to recurse into directories, and search every file it finds along the way.
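To see the nested command at work, here is a small self-contained run on made-up files. The directory, the file names and the wrapper function star_without_start are all hypothetical. Two caveats apply: the unquoted $(...) assumes file names without whitespace, and if the inner grep lists no files at all, the outer grep is left without file arguments and reads standard input instead.

```shell
# Hypothetical demo directory with three small files.
demo=$(mktemp -d)
printf 'foo star bar\n'  > "$demo/only_star"
printf 'oof start rab\n' > "$demo/only_start"
printf 'star\nstart\n'   > "$demo/both"

star_without_start() {
    # Run in a subshell so the caller's working directory is unchanged.
    # The inner grep -L lists files lacking "start"; the outer grep -l keeps
    # those that contain "star".
    ( cd "$1" && grep -lF star $(grep -LF start *) )
}
star_without_start "$demo"
```

Only the file only_star is reported: only_start and both are dropped by the inner grep because each contains "start".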
With GNU grep for -w:
$ cat file
foo star bar
oof start rab
$ grep -w star *
foo star bar
or if you just want the names of the files containing star:
$ grep -lw star *
file
and to use find to select the files to search:
$ find . -maxdepth 1 -type f -exec grep -w 'star' {} \;
foo star bar

grep without extended pattern option on finding files that have characters after the pattern

I have a set of files in a directory. In a few of them, the pattern config_dict["backup.moduleDir"] is followed by some characters. In a few others, the pattern appears exactly at the end of the line (no characters follow it). Note that the pattern appears exactly once in each of these files.
Now, I want to find those file names which have some characters following a matching pattern. I use the below code:
find . -type f -name "*.py" -exec grep -El 'config_dict\["backup.moduleDir"].+$' {} \;
Actually I want to avoid the use of regex character '+' and extended pattern option -E of grep. So I tried using the grep -v logic by the following 2 ways, but it did not give me the expected result. What really went wrong in the below 2 methods?
grep -vl 'config_dict\["backup.moduleDir"\]$' `find . -type f -name "*.py" -exec grep -l 'backup.moduleDir' {} \;`
find . -type f -name "*.py" -exec grep -l 'backup.moduleDir' {} \; | xargs grep -vl 'config_dict["backup.moduleDir"]$'
Surprisingly, in the working code above I have to escape only the opening square bracket '[', whereas escaping is optional for the closing square bracket ']', for the double quotes, and for the dot between "backup" and "moduleDir". How is this possible?
Using a simple dot without + does the job:
grep 'config_dict\["backup.moduleDir"].' *.py
This will find config_dict["backup.moduleDir"] followed by at least 1 character, in all python scripts.
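On the escaping question: in a basic regular expression an unescaped [ opens a bracket expression, so it has to be escaped, while a ] outside a bracket expression is an ordinary character. An unescaped dot matches any character, a literal dot included, and double quotes mean nothing special inside a single-quoted shell string. As for the -v attempts, grep -vl lists a file as soon as it has one line that does not match, which is true of almost any file. The sketch below checks the single-dot pattern on two made-up files; the file names and the function name are hypothetical.

```shell
# Hypothetical demo files: one with text after the pattern, one ending with it.
demo=$(mktemp -d)
printf 'config_dict["backup.moduleDir"] = "/tmp"\n' > "$demo/with_suffix.py"
printf 'x = config_dict["backup.moduleDir"]\n' > "$demo/at_end.py"

files_with_trailing_text() {
    # The final dot demands at least one character after the closing bracket.
    ( cd "$1" && grep -l 'config_dict\["backup.moduleDir"].' *.py )
}
files_with_trailing_text "$demo"
```

Only with_suffix.py is listed, because in at_end.py nothing follows the closing bracket.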

Using grep to find a string that starts with a character with numbers after

Okay I have a file that contains numbers like this:
L21479
What I am trying to do is use grep (or a similar tool) to find all the strings in a file that have the format:
L#####
Each # is a digit, so an L followed by 5 digits.
Is this even possible in grep? Should I load the file and perform regex?
You can do this with grep, for example with the following command:
grep -E -o 'L[0-9]{5}' name_of_file
For example, given a file with the text:
kasdhflkashl143112343214L232134614
3L1431413543454L2342L3523269ufoidu
gl9983ugsdu8768IUHI/(JHKJASHD/(888
The command above will output:
L23213
L14314
L35232
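If it is just in a single file, you can do something along the lines of the following. Note the uppercase -E, which enables extended regular expressions so that {5} is read as a repetition count:
grep -E 'L[0-9]{5}' filename
If you need to search all files in a directory for these strings:
find . -type f | xargs grep -E 'L[0-9]{5}'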
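The same interval can also be written without extended regular expressions: in a basic regular expression the braces are escaped as \{5\}. The following sketch reuses the sample text from the first answer; the temporary file and the function name extract_codes are illustrative, and -o is assumed to be available (it is in GNU and BSD grep).

```shell
# Sample text copied from the answer above, stored in a temporary file.
demo_file=$(mktemp)
cat > "$demo_file" <<'EOF'
kasdhflkashl143112343214L232134614
3L1431413543454L2342L3523269ufoidu
gl9983ugsdu8768IUHI/(JHKJASHD/(888
EOF

extract_codes() {
    # Basic regex: \{5\} means exactly five repetitions of [0-9]; -o prints only the match
    grep -o 'L[0-9]\{5\}' "$1"
}
extract_codes "$demo_file"
```

This prints L23213, L14314 and L35232, one per line, the same result as the -E form.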
