grep specific pattern from a log file - grep

I am passing all my svn commit log messages to a file and want to grep only the JIRA issue numbers from that.
Some lines might have more than 1 issue number, but I want to grab only the first occurrence.
The pattern is XXXX-999 (number of alpha and numeric char is not constant)
Also, I don't want the entire line to be displayed, just the JIRA number, without duplicates. I use the following command but it didn't work.
Could someone help please?
cat /tmp/jira.txt | grep '^[A-Z]+[-]+[0-9]'
Log file sample
------------------------------------------------------------------------
r62086 | userx | 2015-05-12 11:12:52 -0600 (Tue, 12 May 2015) | 1 line
Changed paths:
M /projects/trunk/gradle.properties
ABC-1000 This is a sample commit message
------------------------------------------------------------------------
r62084 | usery | 2015-05-12 11:12:12 -0600 (Tue, 12 May 2015) | 1 line
Changed paths:
M /projects/training/package.jar
EFG-1001 Test commit
Output expected:
ABC-1000
EFG-1001

First of all, it seems like you have the second + in the wrong place, it should be at the end of [0-9] expression.
Second, I think all you need to do this is use the -o option to grep (to display only the matching portion of the line), then pipe the grep output through sort -u, like this:
cat /tmp/jira.txt | grep -oE '^[A-Z]+-[0-9]+' | sort -u
Although if it were me, I'd skip the cat step and just give the filename to grep, as so:
grep -oE '^[A-Z]+-[0-9]+' /tmp/jira.txt | sort -u
Six of one, half a dozen of the other, really.

Related

Grep finds the word and then stucks (bash)

I have a loop (while, extracting 2 variables) where I found one command is not working. Even when I put the command in the console directly (subsituting by my own the variable) it gives the result but continue working without any advance.
The command's objective is to find in a big file.gct, in specific in its first three lines, an object obtained from other file and then print the finding and everything before in that line.
If someone know why it stucks and how to fix it or even an alternative that works well in loops and does not demands more RAM's use it would be appreciated.
head -3 file_2 | grep -E -o ".{0,1000}$variable."
Kind of an example as how it looks the big file (file_2):
head -3 file_2
| #1.2 |
| 57000 | 17300 |
|Irrelevant|Irrelevant2| DATA-B12-18 | DATA-Y17-72 | DATA-A12-44 | .... |
When I run in the terminal: head -3 file_2 | grep -E -o ".{0,1000}DATA-B12-18"
the output is:
Irrelevant Irrelevant2 DATA-B12-18 and then stacks.

print the rest of input along with matching line

I am new to linux and I am experimenting with basic terminal commands. I found out that I can list all users using compgen -u but what if I only want to display the bottom line outputs ?
Ok lets say the output of compgen -u goes like this:
extra
extra
extra
extra
extra
extra
extra
extra
extra
John
William
Kate
Harold
I can only use grep to find a single text (ex. compgen -u | grep John). But what if I want to use grep to display John as well as all the remaining entries after it ?
sed or awk solution would be easier, but if you can only use grep, then the option --after-context (or -A) might do:
grep -A 5 John file
The drawback is that you need to know the number of lines to display after the matching (or use an arbitrary big number for the rest of the file).
compgen -u | grep -A$(compgen -u| wc -l) John
Explanation:
From man grep
-A NUM, --after-context=NUM
Print NUM lines of trailing context after matching lines. Places a line containing a group separator (described under --group-separator) between
contiguous groups of matches.
grep -A -- print number of rows after pattern
$() -- Execute unix command
compgen -u| wc -l --> Get total number of rows of output of command.
You can use the following one-liner :
n=$( compgen -u | grep -n John | head -1 | cut -d ":" -f 1 ) && compgen -u | tail -n +$n
This finds out the line number for first occurrence of John, and prints everything starting that line.

grep last match and it's following lines

I've learnt how to grep lines before and after the match and to grep the last match but I haven't discovered how to grep the last match and the lines underneath it.
The scenario is a server log.
I want to list a dynamic output from a command. The command is likely to be used several times in one server log. So I imagine the match would be the command and somehow grep can use -A with some other flag or variation of a tail command, to complete the outcome I'm seeking.
The approach I would take it to reverse the problem as it's easier to find the first match and print the context lines. Take the file:
$ cat file
foo
1
2
foo
3
4
foo
5
6
Say we want the last match of foo and the following to lines we could just reverse the file with tac, find the first match and n lines above using -Bn and stop using -m1. Then simple re-reverse the output with tac:
$ tac file | grep foo -B2 -m1 | tac
foo
5
6
Tools like tac and rev can make problems that seem difficult much easier.
using awk instead:
awk '/pattern/{m=$0;l=NR}l+1==NR{n=$0}END{print m;print n}' foo.log
small test, find the last line matching /8/ and the next line of it:
kent$ seq 20|awk '/8/{m=$0;l=NR}l+1==NR{n=$0}END{print m;print n}'
18
19

Is there a way in grep to find out how many lines matched the grep result?

Suppose I write a grep query to find out the occurrence of a method call on an object like this:
// might not be accurate, but irrelevant
grep -nr "[[:alnum:]]\.[[:alnum:]](.*)" .
This would give many results. How to find out how many such results are obtained?
What about using | wc -l to count the number of result lines?
What about
man grep | grep "count"
It outputs
-c, --count
Suppress normal output; instead print a count of matching lines for each input file. [...]
Previous answers are OK, I just want to put it into command line instructions in order to have copy-paste versions (from explicit to simplest) for the future:
grep --count "PATTERN" FILE
Is exactly the same as:
grep -c "PATTERN" FILE
And it is equivalent to:
grep "PATTERN" FILE | wc -l
As a bonus, below i give you a version where a file with a list of patterns is used.
grep -count --file=PATTERNFILE FILE
or simply
grep -cf PATTERNFILE FILE

basic grep

I have a large file where each line contains a substring such as ABC123. If I execute
grep ABC file.txt
or
grep ABC1 file.txt
I get those lines back as expected, but if I execute
grep ABC12 file.txt
grep fails to find the corresponding lines.
This seems pretty trivial functionality, but I'm not a heavy user of grep so perhaps I'm missing some gotcha.
Use something like
od -x -a < filename
to dump out the file contents in hex. That'll immediately show you if what you have in your file is what you expect. Which I suspect it isn't :-)
Note: od has lots of useful options to help you here. Too many to list, in fact.
Is there a chance your file contains some hidden character, such as 0x00 ?
This doesn't make sense. Are you sure the file contains "ABC123"?
You can verify this by running following command in a shell
echo "ABC123" | grep ABC12
If the lines contain ABC123, then "grep ABC12" should get them. Do you perhaps mean that you want to match several different strings, such as ABC1, ABC2 and ABC3? In that case you can try this:
grep -E 'ABC1|ABC2|ABC3'
I'm not sure what the problem is.. grep works exactly as it should.. For example, the contents of my test file:
$ cat file.txt
ABC
ABC1
ABC12
ABC123
..and grep'ing for ABC, ABC1, ABC12, ABC123:
$ grep ABC file.txt
ABC
ABC1
ABC12
ABC123
$ grep ABC1 file.txt
ABC1
ABC12
ABC123
$ grep ABC12 file.txt
ABC12
ABC123
$ grep ABC123 file.txt
ABC123
grep is basically a filter, any line containing the first argument (ABC, or ABC1 etc) will be displayed. If it doesn't contain the entire string, it will not be displayed

Resources