How to grep out a batch of consecutive lines from a file

I want to grep out a batch of consecutive lines that starts at a specific pattern and ends at another specific pattern.
E.g. the content of the file looks like this:
line 1
line 2
.
.
.
my_start_pattern
.
.
.
my_end_pattern
.
.
.
line n
The output of grep should look like the following:
my_start_pattern
.
.
.
my_end_pattern
Thanks.

Don't know if grep can do this, but awk can.
awk '/start pattern/,/end pattern/' data_file_name
(leave off the file name if you want to filter from stdin)
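
A quick way to try the range pattern on throwaway data. The file name sample.txt and its contents are made up for illustration; the sed line is shown only as an equivalent alternative, not something the answer above proposed.

```shell
# Build a small sample file in a temporary directory (contents are made up).
tmp=$(mktemp -d)
printf '%s\n' 'line 1' 'line 2' 'my_start_pattern' 'a' 'b' 'my_end_pattern' 'line n' > "$tmp/sample.txt"

# awk range pattern: print from the first line matching the start
# pattern through the next line matching the end pattern, inclusive.
awk_out=$(awk '/my_start_pattern/,/my_end_pattern/' "$tmp/sample.txt")

# sed equivalent: -n suppresses the default output, p prints the range.
sed_out=$(sed -n '/my_start_pattern/,/my_end_pattern/p' "$tmp/sample.txt")

printf '%s\n' "$awk_out"
rm -rf "$tmp"
```

If the start pattern occurs more than once, both commands print every such block, not only the first one.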

Related

show filename with matching word from grep only

I am trying to find which words occur in log files, and to show the log file name for anything that matches the following pattern:
'BA10\|BA20\|BA21\|BA30\|BA31\|BA00'
so if file dummylogfile.log contains BA10002 I would like to get a result such as:
dummylogfile.log:BA10002
It is totally fine if the log file shows up twice for duplicate matches.
The closest I got is:
for f in $(find . -name '*.err' -exec grep -l 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' {} \+);do printf $f;printf ':';grep -o 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' $f;done
but this gives things like:
./register-05-14-11-53-59_24154.err:BA10
BA10
./register_mdw_files_2020-05-14-11-54-32_24429.err:BA10
BA10
./process_tables.2020-05-18-11-18-09_11428.err:BA30
./status_load_2020-05-18-11-35-31_9185.err:BA30
so,
1) there are empty lines with only the second match and
2) the full match (e.g., BA10004) is not shown.
thanks for the help

There are a couple of options you can pass to grep:
-H: print the file name with each match
-o: only show the match, not the full line
-w: the match must be a full word (a string built from [A-Za-z0-9_])
Take a pattern such as BA01 as an example. It matches only the four characters BA01, and it matches them anywhere in the text, including mid-word. If you want the regex to match the full word, it should read BA01[[:alnum:]_]*, which appends any sequence of word-constituent characters (equivalent to [A-Za-z0-9_]). You can test this with
$ echo "foo BA01234 barBA012" | grep -Ho "BA01"
(standard input):BA01
(standard input):BA01
$ echo "foo BA01234 barBA012" | grep -How "BA01"
$ echo "foo BA01234 barBA012" | grep -How "BA01[[:alnum:]_]*"
(standard input):BA01234
So your grep should look like
grep -How "\(BA10\|BA20\|BA21\|BA30\|BA31\|BA00\)[[:alnum:]_]*" *.err
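
To check the -H, -o and -w combination against throwaway data, here is a small sketch. It assumes GNU grep (the \| alternation in a basic regex and -o are not POSIX); the file names a.err, b.err, c.err and their contents are invented for the test.

```shell
# Create three made-up log files in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'error BA10002 raised' 'then BA10004, again' > "$tmp/a.err"
printf '%s\n' 'nothing here' > "$tmp/b.err"
printf '%s\n' 'code xBA30001 and BA30007' > "$tmp/c.err"

# -H prefixes the file name, -o prints only the matched text,
# -w rejects matches that start or end in the middle of a word.
matches=$(cd "$tmp" && grep -How '\(BA10\|BA20\|BA21\|BA30\|BA31\|BA00\)[[:alnum:]_]*' *.err)

printf '%s\n' "$matches"
rm -rf "$tmp"
```

The xBA30001 token is skipped because the match would start mid-word, while BA30007 on the same line is reported.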

From your example it seems that all files are in one directory, so the following lists the matching files right away:
grep -l 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' *.err
If the files are in different directories:
find . -name '*.err' -print | xargs -I {} grep 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' {} /dev/null
Explanation: adding /dev/null after the filename {} gives grep more than one file to search, which forces it to report the matching filename.
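
For files spread over subdirectories, a variant that may be easier to read is to let find batch the file names itself and ask grep for the file name with -H. This is a sketch assuming GNU or BSD grep (for -H and -o); the directory layout and the file job.err are made up.

```shell
# Made-up nested layout with one log file.
tmp=$(mktemp -d)
mkdir -p "$tmp/logs/sub"
printf '%s\n' 'BA21009 failed' > "$tmp/logs/sub/job.err"

# -exec ... {} + passes many file names per grep call, like xargs,
# and is not confused by spaces in names.
# -H makes grep print the file name even if it receives a single file,
# which is the effect the extra /dev/null operand has.
found=$(cd "$tmp" && find . -name '*.err' -exec grep -Ho 'BA21[[:alnum:]_]*' {} +)

printf '%s\n' "$found"
rm -rf "$tmp"
```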

Unable to match patterns from one file line by line with contents of other file | bash shell

I have a file1.txt containing text like:
123 456 789
I need to search these strings line by line in another file2.txt like this:
"123" This is line 1
"456" This is line 2
"789" This is line 3
Matching lines need to be echoed or redirected to file3.txt.
I tried a couple of ways:
while read -r line; do
    grep "$line" -c file2.txt
done < file1.txt
This doesn't give me any matches, although there are some.
I also tried grep like this:
grep -f file1.txt -c file2.txt
which unfortunately doesn't work either.
For all three matches, output should have been:
1
1
1
I am new to shell scripting. Could someone please suggest what is wrong here?
Thanks in advance.

If you are OK with awk, you could try the following.
awk 'FNR==NR{a[$0];next} ($2 in a)' file1.txt FS="\"" file2.txt
Explanation: a detailed breakdown of the command above.
awk ' ##Starting awk program from here.
FNR==NR{ ##Checking condition FNR==NR which will be TRUE when file1.txt is being read.
a[$0] ##Creating array a with index current line here.
next ##next will skip further statements from here.
}
($2 in a) ##Checking condition if 2nd field is present in array a then print the current line from file2.txt
' file1.txt FS="\"" file2.txt ##Mentioning Input_file names here, where setting FS as " for file2.txt
2nd solution: reading the input files in the opposite order.
awk 'FNR==NR{a[$2]=$0;next} ($0 in a){print a[$0]}' FS="\"" file2.txt file1.txt
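
The first awk command can be tried end to end as sketched below. It assumes file1.txt holds one key per line with Unix line endings (a trailing carriage return from Windows line endings would stop every key from matching); the third line of file2.txt is a made-up non-matching row.

```shell
# Sample inputs in a temporary directory; the "999" row is invented.
tmp=$(mktemp -d)
printf '%s\n' 123 456 789 > "$tmp/file1.txt"
printf '%s\n' '"123" This is line 1' '"456" This is line 2' '"999" No match here' > "$tmp/file2.txt"

# First pass stores each line of file1.txt as an array key.
# FS is then set to a double quote before file2.txt is read, so $2 is
# the text between the first pair of quotes on each line.
awk 'FNR==NR{a[$0];next} ($2 in a)' "$tmp/file1.txt" FS='"' "$tmp/file2.txt" > "$tmp/file3.txt"

matched=$(cat "$tmp/file3.txt")
count=$(wc -l < "$tmp/file3.txt" | tr -d ' ')

printf '%s\n' "$matched"
rm -rf "$tmp"
```

Because the lookup is an exact string comparison on the quoted field, a key such as 123 does not match a row quoted as "1234".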

grep exact match of string with alphabets and numbers

I am using grep to extract lines from file1 that match strings in file2. The strings in file2 contain both letters and numbers, e.g.:
MSTRG.18691.1
MSTRG.18801.1
I used sed to write word boundaries for all the strings in the file 2.
file 2
\<MSTRG.18691.1\>
\<MSTRG.18801.1\>
and used grep -f file2 file1
but output has
MSTRG.18691.1.2
MSTRG.18801.1.3 also..
I want lines that matches exactly,
MSTRG.18691.1
MSTRG.18801.1
and not,
MSTRG.18691.1.2
MSTRG.18801.1.3
A few lines from my file1:
t_name gene_name FPKM TPM
MSTRG.25.1 . 0 0
rna71519 . 93.398872 194.727926057583
gene34024 ND1 2971.72876 6195.77694943117
MSTRG.28.1 . 0 0
MSTRG.28.2 . 0 0
rna71520 . 33.235409 69.2927240732149

Updating the answer
You can anchor the pattern with ^ (start of line) and $ (end of line). To match exactly MSTRG.18691.1, add ^ and $ at the two ends and remove the word boundaries. Additionally, . has a special meaning in a regex (it matches any character), so to match a literal . you need to escape it with a backslash: \. Note that this assumes the ID is the only thing on the line.
Example pattern:
^MSTRG\.18691\.1$
^MSTRG\.18801\.1$
file1
MSTRG.18691.1
MSTRG.1311.1
MSTRG.18801.2
MSTRG.18801.3
MSTRG.18801.1.2
MSTRG.18801.1.1
MSTRG.18801.1
PrefixMSTRG.18801.1
Just create a normal file named file1 and paste the above content into it.
file2 (pattern file)
^MSTRG\.18801\.1$
Just create a normal file named file2 and paste the above content into it.
Run the below command from commandline
grep -i --color -f file2 file1
Result:
MSTRG.18801.1
Sed to add changes to the pattern file
Here is the sed command to escape . and add ^ and $ at the beginning and end of the pattern file you already have.
sed -Ee 's/\./\\./g' -e 's/^/\^/g' -e 's/$/\$/g' file2 > file2_updated
-E enables extended regular expressions (as on BSD sed); depending on your system's sed you may need to replace -E with -r.
Updated patterns will be saved to file2_updated. Need to use the new pattern file in grep like this
grep -i -f file2_updated file1

The flag you're looking for is -F. From man grep:
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings (instead of regular expressions), separated by newlines, any of which is to be matched.
You can use this quite comfortably in conjunction with -f:
grep -Ff file2 file1
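To be clear, this will treat every line of file2 as a literal string rather than a regular expression, so the dots are no longer wildcards. It is still a substring match: MSTRG.18691.1 is also found inside MSTRG.18691.1.2. Add -x if the whole line must equal the string.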
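
Since file1 in the question has several columns, neither whole-line anchors nor -x fit it directly. One alternative, swapping grep for awk, is to compare the first column by string equality. This is only a sketch: it assumes the ID is the first whitespace-separated field, and the sample rows below are made up to resemble the question's data.

```shell
# Made-up sample data in the shape shown in the question.
tmp=$(mktemp -d)
printf '%s\n' 'MSTRG.18691.1' 'MSTRG.18801.1' > "$tmp/file2"
printf '%s\t%s\t%s\t%s\n' \
  't_name' 'gene_name' 'FPKM' 'TPM' \
  'MSTRG.18691.1' '.' '0' '0' \
  'MSTRG.18691.1.2' '.' '0' '0' \
  'MSTRG.18801.1' '.' '0' '0' \
  'MSTRG.18801.1.3' '.' '0' '0' > "$tmp/file1"

# Load the IDs from file2 as array keys, then keep the rows of file1
# whose first field is exactly one of them (no regex involved).
exact=$(awk 'NR==FNR{ids[$1]; next} $1 in ids' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$exact"
rm -rf "$tmp"
```

Rows such as MSTRG.18691.1.2 are dropped because the comparison is on the complete field, not on a prefix.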

Retrieve only the matching pattern of a list in another list with grep

I have this kind of files:
file1 :
TMCS09g1008676.1
MAMBO.3.3.2.1
TMCS09g1008678.1
TMCS09g1008678.2
CSH.1.2
TMCS09g1008681.3
TMCS09g1008682.1
TMCS09g1008683
file2:
TMCS09g1008676
MAMBO.3.3.2
TMCS09g1008678
TMCS09g1008679
CSH.1
TMCS09g1008681
TMCS09g1008682
TMCS09g1008683
What I want to do is to retrieve only the matching part of file2 in file1, basically only the part highlighted in red in the terminal when running the command grep -w -F -f file2 file1.
Is there a way to achieve this?
thanks in advance
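
One possible approach, assuming GNU or BSD grep, is to add -o to the same command: -o prints only the matched part of each line, which is the text that would otherwise be highlighted. The sketch below uses a shortened copy of the two lists.

```shell
# Shortened copies of the lists from the question.
tmp=$(mktemp -d)
printf '%s\n' 'TMCS09g1008676.1' 'MAMBO.3.3.2.1' 'CSH.1.2' 'TMCS09g1008683' > "$tmp/file1"
printf '%s\n' 'TMCS09g1008676' 'MAMBO.3.3.2' 'TMCS09g1008679' 'CSH.1' 'TMCS09g1008683' > "$tmp/file2"

# -F: fixed strings, -w: whole words only, -f: read patterns from file2,
# -o: print only the matched text instead of the whole line.
parts=$(grep -o -w -F -f "$tmp/file2" "$tmp/file1")

printf '%s\n' "$parts"
rm -rf "$tmp"
```

Patterns from file2 that never occur in file1 (TMCS09g1008679 here) simply produce no output.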

How can I grep hidden files?

I am searching through a Git repository and would like to include the .git folder.
grep does not include this folder if I run
grep -r search *
What would be a grep command to include this folder?

Please refer to the solution at the end of this post as a better alternative to what you're doing.
You can explicitly include hidden files (a directory is also a file).
grep -r search * .[^.]*
The * will match all files except hidden ones, and .[^.]* will match only hidden files, excluding .. (the parent directory). However, this will fail if a given directory has either no non-hidden files or no hidden files. You could of course explicitly add .git instead of .[^.]*.
However, if you simply want to search in a given directory, do it like this:
grep -r search .
The . will match the current path, which will include both non-hidden and hidden files.
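
The difference can be seen on a made-up directory tree. The sketch assumes a grep that supports -r (GNU or BSD) and a shell with default globbing, where * skips names starting with a dot.

```shell
# Made-up tree with one hidden directory and one visible directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/.git" "$tmp/src"
printf 'search me\n' > "$tmp/.git/config"
printf 'search me too\n' > "$tmp/src/main.txt"

# The shell expands * before grep runs and leaves out dot names,
# so .git is never handed to grep.
with_star=$(cd "$tmp" && grep -rl search * | sort)

# Starting from . lets grep walk the tree itself, hidden entries included.
with_dot=$(cd "$tmp" && grep -rl search . | sort)

printf 'with *:\n%s\nwith .:\n%s\n' "$with_star" "$with_dot"
rm -rf "$tmp"
```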

I just ran into this problem, and based on @bitmask's answer, here is my simple modification to avoid the problem pointed out by @sehe:
grep -r search_string * .[^.]*

Perhaps you will prefer to combine "grep" with the "find" command for a complete solution like:
find . -exec grep -Hn search {} \;
This command will search inside hidden files and directories for the string "search" and list every match in this output format:
File path:Line number:matching line
./foo/bar:42:search line
./foo/.bar:42:search line
./.foo/bar:42:search line
./.foo/.bar:42:search line

To prevent matching . and .. which are not hidden files, you can use grep with ls -A like in this example:
ls -A | grep "^\."
^\. states that the first character must be .
The -A or --almost-all option excludes the results . and .. so that only hidden files and directories are matched.

You may want to use this approach, assuming you're searching the current directory (otherwise replace . with the desired directory):
find . -type f | xargs grep search
or if you just want to search at the top level (which is quicker to test if you're trying these out):
find . -maxdepth 1 -type f | xargs grep search
UPDATE: I modified the examples in response to Scott's comments. I also added "-type f".
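
Plain find | xargs splits file names on whitespace, so names containing spaces break it. A more defensive sketch, assuming find and xargs versions that support -print0 and -0 (GNU and BSD do; they are not POSIX), is shown below with a made-up hidden directory.

```shell
# Made-up hidden directory and file, both with spaces in their names.
tmp=$(mktemp -d)
mkdir -p "$tmp/.hidden dir"
printf 'search hit\n' > "$tmp/.hidden dir/.notes file"

# -print0 and -0 separate names with NUL bytes, so spaces and newlines
# in file names survive; -H keeps the file name in the output even when
# grep is handed a single file.
safe=$(cd "$tmp" && find . -type f -print0 | xargs -0 grep -H search)

printf '%s\n' "$safe"
rm -rf "$tmp"
```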

To search within ONLY all hidden files and directories from your current location (the pattern .?* is used instead of .* so that the starting directory . itself is not matched):
find . -name ".?*" -exec grep -rs search {} \;
ONLY all hidden files:
find . -name ".?*" -type f -exec grep -s search {} \;
ONLY all hidden directories:
find . -name ".?*" -type d -exec grep -rs search {} \;

All the other answers are better. This one might be easy to remember:
find . -type f | xargs grep search
It finds only files (including hidden) and greps each file.

To find only within a certain folder you can use:
ls -al | grep " \."
It is a very simple command to list and pipe to grep.

In addition to Tyler's suggestion, here is a command to grep recursively, hidden files included. Note that -name "*.*" only matches names containing a dot, so files without an extension are skipped:
find . -name "*.*" -exec grep -li 'search' {} \;

You can also search only specific types of hidden files, for example .directory files:
grep -r --include='*.directory' "search-string" .
This may work better than some of the other options. The other options that worked can be too slow.
