I am trying to look for patterns in a file. The file looks like this:
aaa;
bbb;
If I try the following it does not stop:
cat test | tr -d '\n' | ack -1 'aa.*;'
aaa;bbb;
Is there a way to stop with aaa;?
Your regex is greedy, so it includes all text after the second a until the final semi-colon. If you refine the pattern to NOT be greedy (with a question mark) and use the -o option, you'll get what you expect:
$ cat test.txt | tr -d '\n' | ack -o 'aa.*?;'
aaa;
This is doing the trick:
ack -1 -io 'a[^;]+;'
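A minimal sketch of the same greedy-versus-bounded difference, assuming plain grep -oE rather than ack (ack may not be installed everywhere). The sample input and variable names are made up for illustration. POSIX ERE has no non-greedy quantifier, so the negated class [^;] is what stops the match at the first semicolon:

```shell
# Hypothetical sample input: the two lines from the question joined into one.
input='aaa;bbb;'

# Greedy: .* runs on to the last semicolon on the line.
greedy=$(printf '%s\n' "$input" | grep -oE 'aa.*;')

# Bounded: [^;]* cannot cross a semicolon, so the match ends at the first one.
first=$(printf '%s\n' "$input" | grep -oE 'aa[^;]*;' | head -n 1)

printf 'greedy: %s\n' "$greedy"
printf 'first:  %s\n' "$first"
```

The head -n 1 plays the role of ack's -1 here: it keeps only the first match.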
I have a .txt file with city names, one per line. Some names consist of several words separated by one or more spaces, or of words joined with '-'. I need a bash command that prints those lines. Currently I'm using cat piped into grep, but I can't get both spaces and dashes into one search, and I had problems matching multiple spaces.
Print lines with a dash:
cat file.txt | grep ".*-.*"
Print lines with spaces:
cat file.txt | grep ".*\s.*"
But when I try:
cat file.txt | grep ".*\s+.*"
I get nothing.
Thanks for help
Something like that should work:
grep -E -- ' |\-' file.txt
Explanation:
-E: to interpret patterns as extended regular expressions
--: to signify the end of command options
' |\-': the line contains either a space or a dash
This does not directly address your question, but is too much to put in a comment.
You don't need the .* in your patterns. .* at the beginning or end of a pattern is useless, because it means "0 or more of any character" and so will always match.
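These commands all print the same lines (once the pattern starts with a dash followed by other characters, pass it with -e so grep does not take it for an option):
cat file.txt | grep ".*-.*"
cat file.txt | grep -e "-.*"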
cat file.txt | grep "-"
Plus you don't need to cat and pipe:
grep "-" file.txt
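A quick way to check that equivalence, as a sketch on made-up sample lines (the variable names are only for illustration). Both patterns are passed with -e so that a leading dash is never parsed as an option:

```shell
# Hypothetical sample data: two lines with a dash, one without.
sample='left-right
plain
up-down'

# With the redundant .* on both sides of the dash.
with_wildcards=$(printf '%s\n' "$sample" | grep -e '.*-.*')

# With the bare dash only.
bare=$(printf '%s\n' "$sample" | grep -e '-')

if [ "$with_wildcards" = "$bare" ]; then
  echo "same lines selected"
fi
```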
When a grep pattern matches, the default action is to print the whole line, so the .* in all your patterns is redundant and can be deleted. Also, you don't need cat file |, since you can give the file to grep directly after the pattern, i.e. grep 'pattern' file.txt.
Here are some more details:
grep ".*-.*" = grep -- "-" - returns any line containing a - char (-- signals the end of options; the next argument is the pattern)
grep ".*\s.*" = grep "\s" - matches and returns lines containing a whitespace char (GNU grep only)
grep ".*\s+.*" = grep "\s+" - returns lines containing a whitespace char followed by a literal + char (since you are using POSIX BRE syntax here, the unescaped + matches a literal plus symbol).
You want
grep "[[:space:]-]" file.txt
See the online demo:
#!/bin/bash
s='abc - def
ghi
jkl mno'
grep '[[:space:]-]' <<< "$s"
Output:
abc - def
jkl mno
The pattern [[:space:]-] is valid in both POSIX BRE and ERE (enabled with the -E option) and matches either any whitespace character (via the [:space:] POSIX character class) or a hyphen.
Note that [\s-] won't work, since \s inside a bracket expression is not treated as a regex escape sequence but as a literal \ or s.
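To make the BRE versus ERE point concrete, here is a small sketch on invented city names (the data and variable names are assumptions, not from the question). It uses [[:space:]] instead of \s so that it does not depend on GNU extensions:

```shell
# Hypothetical sample data.
cities='New York
Paris
Stratford-upon-Avon
Rio  de  Janeiro'

# BRE: an unescaped + is a literal plus sign, so no line matches.
bre=$(printf '%s\n' "$cities" | grep '[[:space:]]+' || true)

# ERE (-E): + means "one or more", so lines with one or more spaces match.
ere=$(printf '%s\n' "$cities" | grep -E '[[:space:]]+')

# One bracket expression covering whitespace and the hyphen.
both=$(printf '%s\n' "$cities" | grep '[[:space:]-]')

printf 'BRE matches:\n%s\n' "$bre"
printf 'ERE matches:\n%s\n' "$ere"
printf 'space or dash:\n%s\n' "$both"
```

Only "Paris" is left out of the last result, because it contains neither whitespace nor a hyphen.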
I have a bunch of strings that I have to fetch the 'port_num' from -
"76 : client=new; tags=circ, LINK; port_num=switch01; far_port=Gi1/0"
The word might be in a different place in the string and it might be a different length, but it always says 'port_num=' before it and ';' after it...
I only want this bit- 'switch01'
Currently I use-
| grep -Eo 'port_num=.+' | cut -d"=" -f2 | cut -d";" -f1
But there has got to be a better way
You can try grep -oP '(?<=port_num=).+(?=;)'. If you run this:
echo "76 : client=new; tags=circ, LINK; port_num=switch01; far_port=Gi1/0" \
| grep -oP '(?<=port_num=).+(?=;)'
result will be:
switch01
Updated answer: grep -oP '(?<=port_num=)[^;]+(?=;)'
This is what I would use:
... | grep -E 'port_num=.+' | sed 's/^.*port_num=\([^;]*\).*$/\1/'
This works with or without the -o on grep, and the availability of -P will depend on the version of grep you have. (e.g., my grep does not have it). I'm not saying the other answers that rely on -P aren't any good -- they look fine to me. But grep -P will be less portable.
IMHO, piping grep with sed allows each utility to do what it specializes in -- grep is for selecting lines, sed is for modifying lines.
This can be done in a simple sed command:
s="76 : client=new; tags=circ, LINK; port_num=switch01; far_port=Gi1/0"
sed 's/.*port_num=\([^;]*\);.*/\1/' <<< "$s"
switch01
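The sed approach can be wrapped in a small helper, sketched below. The function name get_port_num and the extra test lines are hypothetical. Adding -n and the p flag means that lines without a port_num field print nothing instead of passing through unchanged:

```shell
# Hypothetical helper: read lines on stdin and print the port_num value
# for each line that has one.
get_port_num() {
  sed -n 's/.*port_num=\([^;]*\);.*/\1/p'
}

# The sample line from the question.
s="76 : client=new; tags=circ, LINK; port_num=switch01; far_port=Gi1/0"
value=$(printf '%s\n' "$s" | get_port_num)
printf '%s\n' "$value"

# A made-up line with more semicolons after the field: [^;]* still stops
# at the first one.
other=$(printf '%s\n' "a=1; port_num=sw2; b=3; c=4;" | get_port_num)
printf '%s\n' "$other"
```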
... | grep -Po 'port_num.+(?=;)'
This uses grep's Perl Compatible Regular Expression (PCRE) syntax. The (?=;) is a look-ahead assertion which looks for a match with ";" but doesn't include it in the matched output.
This produces:
port_num=switch01
As @Vladimir Kovpak noted, if you want to exclude the "port_num=" string from this output, add a look-behind assertion:
... | grep -Po '(?<=port_num=).+(?=;)'
It is a simple question but I didn't find an answer...
I have a tab-separated file with many rows and a different number of fields in each row. Like this:
a1_j a2_f a3_f a10_g a8_t a2_e
a2_j
a6_h a8_o
a9_g
I just want to print those fields that start with a2, but not the whole line, just the matched fields. Like this:
a2_f
a2_e
a2_j
I tried with awk, with no success.
I would use grep to do this:
grep -o 'a2_[a-z]' file
The -o switch means that only matches are printed, each on a separate line.
You could loop through all the fields with a for loop, or use fmt to put each field on its own line:
~$ fmt -w1 f
a1_j
a2_f
a3_f
a10_g
a8_t
a2_e
a2_j
a6_h
a8_o
a9_g
and then filter with grep or, if you want to use awk:
~$ fmt -w1 f | awk '/a2/{print}'
a2_f
a2_e
a2_j
With GNU awk for multi-char RS and \s:
$ awk -v RS='\\s' '/^a2/' file
a2_f
a2_e
a2_j
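The for loop over fields mentioned earlier is not spelled out in the answers. A minimal sketch with plain POSIX awk, assuming the fields really are tab-separated, could look like this (the sample data is rebuilt here with printf, and the variable name is arbitrary):

```shell
# Rebuild the sample from the question with real tab separators,
# then print every field that starts with "a2_".
fields=$(printf 'a1_j\ta2_f\ta3_f\ta10_g\ta8_t\ta2_e\na2_j\na6_h\ta8_o\na9_g\n' |
  awk -F'\t' '{
    for (i = 1; i <= NF; i++)      # visit each field of the current row
      if ($i ~ /^a2_/) print $i    # keep only fields beginning with a2_
  }')

printf '%s\n' "$fields"
```

Anchoring the test with ^ keeps a field such as a12_x or xa2_y from matching by accident.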
I would like to grep a specific word 'foo' inside specific files, then get the N lines around my match and show only the blocks that contain a second grep.
I found this but it doesn't really work...
find . | grep -E '.*?\.(c|asm|mac|inc)$' | \
xargs grep --color -C3 -rie 'foo' | \
xargs -n1 --delimiter='--' | grep --color -l 'bar'
For instance I have the file 'a':
a
b
c
d
bar
f
foo
g
h
i
j
bar
l
The file b:
a
bar
c
d
e
foo
g
h
i
j
k
I expect this for grep -C2 on both files because bar is contained in the -C2 range of foo. I do not get any match for ./bar because bar is not in the range -C2 of foo...
--
./foo- bar
./foo- f
./foo- **foo**
./foo- g
./foo- h
--
Any ideas?
You could do this pretty simply with a "while read line" loop:
find . -regextype posix-extended -regex "./file[a-z]" | while read -r line; do grep -nHC2 "foo" "$line" | grep --color bar; done
Output:
./filea-5-bar
./filec-46-... host pwns.me [94.23.120.252]: 451 4.7.1 Local bar
configuration error ...
In this example, I created the following files:
filea - your example a
fileb - your example b
filec - some random exim log output with foo and bar tossed in 2 lines apart
filed - the same exim log output, but with foo and bar tossed in 3 lines apart
You could also pipe the output after done, to alter the format:
; done | sed -E 's/-([0-9]{1,6})-/: line: \1 ::: /'
Formatted output
./filea: line: 5 ::: bar
./filec: line: 46 ::: ... host pwns.me [94.23.120.252]: 451 4.7.1 Local bar configuration error ...
I think I only understand the first line of your question and this does what I think you mean!
#!/bin/bash
N=2          # lines of context on either side of each match
pattern1=a   # first pattern: selects the candidate lines
pattern2=z   # second pattern: must also appear inside the block
matchinglines=($(awk -v p="$pattern1" '$0~p{print NR}' file)) # Generate array of matching line numbers
for x in "${matchinglines[@]}"
do
  ((start=x-N))
  [[ $start -lt 1 ]] && start=1 # Avoid passing zero or negative line numbers to sed
  ((end=x+N))
  echo "DEBUG: Checking block between lines $start and $end"
  # Print the block only if it contains pattern2
  sed -ne "${start},${end}p" file | grep -q "$pattern2" && sed -ne "${start},${end}p" file
done
You need to set pattern1 and pattern2 at the start of the script. It uses awk to build an array of the line numbers that match your first pattern. Then it loops through the array and sets the start and end of the range to N lines either side of each matching line number. It then uses sed to extract that block and passes it through grep to see whether it contains pattern2, printing the block if it does. It may not be the most efficient, but it is easy enough to understand and maintain.
It assumes your file is called file
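As a sketch of how that idea behaves on the question's own data, the same logic can be wrapped in a function that takes the file and patterns as arguments. The function name blocks_with_both and the temporary file are assumptions for illustration. The body runs in a subshell (parentheses) so that its variables do not leak into the caller:

```shell
# Hypothetical helper: for each line matching pattern1, take the block of
# n lines on either side and print it only if it also contains pattern2.
blocks_with_both() (
  file=$1; pattern1=$2; pattern2=$3; n=$4
  for x in $(awk -v p="$pattern1" '$0 ~ p { print NR }' "$file"); do
    start=$(( x - n ))
    [ "$start" -lt 1 ] && start=1          # sed addresses start at line 1
    end=$(( x + n ))
    block=$(sed -n "${start},${end}p" "$file")
    if printf '%s\n' "$block" | grep -q -e "$pattern2"; then
      printf '%s\n--\n' "$block"           # "--" separates blocks, as grep -C does
    fi
  done
)

# File "a" from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' a b c d bar f foo g h i j bar l > "$tmp"
result=$(blocks_with_both "$tmp" foo bar 2)
rm -f "$tmp"
printf '%s\n' "$result"
```

With file "a", foo is on line 7 and bar on line 5, so the block covering lines 5 to 9 is printed. With file "b", bar sits four lines above foo and nothing is printed.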
pipe it twice
grep "[^foo\n]" | grep "\n{ntimes}foo\n{ntimes}"
I have a file that possibly contains bad formatting (in this case, the occurrence of the pattern \\backslash). I would like to use grep to return only the line numbers where this occurs (as in, the match was here, go to line # x and fix it).
However, there doesn't seem to be a way to print the line number (grep -n) and not the match or line itself.
I can use another regex to extract the line numbers, but I want to make sure grep cannot do it by itself. grep -no comes closest, I think, but still displays the match.
Try:
grep -n "text to find" file.ext | cut -f1 -d:
If you're open to using AWK:
awk '/textstring/ {print FNR}' textfile
In this case, FNR is the line number. AWK is a great tool when you're looking at grep|cut, or any time you're looking to take grep output and manipulate it.
All of these answers require grep to generate the entire matching lines, then pipe it to another program. If your lines are very long, it might be more efficient to use just sed to output the line numbers:
sed -n '/pattern/=' filename
Bash version
lineno=$(grep -n "pattern" filename)
lineno=${lineno%%:*}
I recommend the answers with sed and awk for just getting the line number, rather than using grep to get the entire matching line and then removing that from the output with cut or another tool. For completeness, you can also use Perl:
perl -nE 'say $. if /pattern/' filename
or Ruby:
ruby -ne 'puts $. if /pattern/' filename
using only grep:
grep -n "text to find" file.ext | grep -Po '^[^:]+'
If grep also prints the file name (for example with -H, or when searching several files), the line number is the second colon-separated field, not the first:
grep -Hn "text to find" file.txt | cut -f2 -d:
To count the number of lines that match the pattern:
grep -n "Pattern" in_file.ext | wc -l
To print the lines that match the pattern:
sed -n '/pattern/p' file.ext
To display the line numbers on which the pattern was matched:
grep -n "pattern" file.ext | cut -f1 -d:
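The three main approaches from this thread (grep piped to cut, awk, and sed) can be compared side by side. This is only a sketch: the sample file contents and the pattern "oops" are made up, and a plain word is used instead of the backslash pattern to keep the quoting out of the way:

```shell
# Hypothetical sample file: the pattern occurs on lines 2 and 4.
tmp=$(mktemp)
printf '%s\n' 'ok' 'oops here' 'ok' 'oops again' > "$tmp"

# grep prints "number:line"; cut keeps the part before the first colon.
with_cut=$(grep -n 'oops' "$tmp" | cut -f1 -d:)

# awk prints the per-file record number of each matching line.
with_awk=$(awk '/oops/ { print FNR }' "$tmp")

# sed's = command prints the line number of each addressed line.
with_sed=$(sed -n '/oops/=' "$tmp")

rm -f "$tmp"
printf 'cut: %s\nawk: %s\nsed: %s\n' \
  "$(echo $with_cut)" "$(echo $with_awk)" "$(echo $with_sed)"
```

All three report the same line numbers. The awk and sed variants never emit the matched line itself, which matters mostly when lines are very long.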