Find specific patterns in a logfile with grep

I'm trying to extract certain activities from a logfile with grep.
The lines in the logfile look like the one below:
[07/16/2019 14:09:26:516 CEST] 00018e0a main I [APPName][DEBUG][Various Activity Name][SERVICE] node: <resultSet recordCount="3" columnCount="6">
So my goal is to go over the logfile and get all activity names (various, and sometimes unknown, names).
I tried with a regex but couldn't reach the final goal :-P.
How can I match something like this (here comes my pseudocode):
grep "[AppName][DEBUG] [*]" logfile.log

You're probably better off using sed for this. Just make sure that you are escaping characters properly. Something like this should work and you can pipe it to sort and uniq to get a unique list of activity names:
sed 's/.*\[APPName\]\[DEBUG\]\[\([^]]*\)\].*/\1/' logfile.log | sort | uniq
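If GNU grep is available, a grep-only sketch is also possible: with a Perl-compatible regex (-P), \K discards everything matched so far, so -o prints only the activity name. This assumes the [APPName][DEBUG] prefix is literal, as in the sample line.

```shell
# Print only the text between [APPName][DEBUG][ and the next ']',
# then deduplicate the activity names.
grep -oP '\[APPName\]\[DEBUG\]\[\K[^]]*' logfile.log | sort -u
```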


How to print all matches using grep

I have a text file that contains only location information (location.txt).
Another, larger text file (all.txt) has much more information, such as IDs, and location.txt is a subset of all.txt (some records appear in both).
I want to search for the entries of location.txt in the other file (all.txt) with grep
and print all common records (with the full information from all.txt).
I tried:
grep -f location.txt all.txt
The problem is that grep gives me only the last location, not all locations.
How can I print all locations?
I'm assuming you mean to use one of the files as a set of patterns for grep. Since you want the lines of all.txt that contain a record from location.txt, this is what you want:
grep -Ff location.txt all.txt
Explanation
-F means to interpret the pattern(s) literally, giving no particular meaning to regex metacharacters (like * and +, for example)
-f means to read the patterns from the file named as argument (location.txt in this case).
If you instead wanted the lines of one file not found in the other, add -v to invert the match.
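A minimal sketch with made-up data (the filenames and records below are hypothetical, for illustration only), showing -F/-f picking out the common records:

```shell
# Hypothetical sample data: location.txt holds the patterns,
# all.txt holds the full records that embed a location.
printf '%s\n' 'paris' 'tokyo' > location.txt
printf '%s\n' 'id=1 loc=paris' 'id=2 loc=berlin' 'id=3 loc=tokyo' > all.txt

# -F: treat each pattern as a fixed string; -f: read patterns from a file.
grep -Ff location.txt all.txt
```

This prints the two records containing paris and tokyo, with all the extra fields from all.txt intact.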

Regex different hashes

So I'm struggling with regex. I'll start with what I want to achieve and then proceed to what I have "so far".
So for example I have commit name lines
merge(#2137): done something
Merge pull request #420 from Example/branch
feat(): done something [#2137JDN]
merge(#690): feat(): done something [#2137JDN]
And I want to grep only the PR ID or, if it's not there, fall back to that second hash:
#2137
#420
#2137JDN
#690
For now I have this regex, but it's not perfect
/(\(|\s|\[)(#\d+|#.+)(\)|\s|\])/g
because it's capturing this
(#2137)
\s#420\s
[#2137JDN]
(#690)[#2137JDN]
How can I improve it to get exactly what I want?
You can use the #[\dA-Z]+ pattern to grep only hashes.
command | grep -Po "#[\dA-Z]+"
which returns all the matched strings (in our case, hashes):
#2137
#420
#2137JDN
#690
#2137JDN
Note that plain grep (BRE/ERE) does not support non-greedy quantifiers; the -P flag used above switches to Perl-compatible regexes, which do. See this answer.
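To get only the first hash on each line (the PR ID when present, the second hash otherwise), one sketch, again assuming GNU grep with -P: ^[^#]* consumes everything before the first #, and \K drops it from the reported match, so each line contributes at most one hash.

```shell
# Print only the first hash on each line of the commit messages.
grep -oP '^[^#]*\K#[\dA-Z]+' commits.txt
```

On the four sample lines this yields exactly #2137, #420, #2137JDN, and #690.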

How can I find files that match a two-line pattern using grep?

I created a test file with the following:
<cert>
</cert>
I'm now trying to find this with grep and the following command, but it takes forever to run.
How can I search quickly for files that contain adjacent lines like these?
tr -d '\n' | grep '<cert></cert>' test.test
So, from the comments, you're trying to get the filenames that contain an empty <cert>..</cert> element. You're using several tools wrong. As @iiSeymour pointed out, tr only reads from standard input, so if you want to use it to select from lots of filenames, you'll need a loop. grep prints matching lines, not filenames, though you could use grep -l to see the filenames instead.
But you're only joining lines because grep works one line at a time; so let's use a better tool. Here's how to search with awk:
awk '/<cert>/ { started=1; }
/<\/cert>/ { if (started) { print FILENAME; nextfile;} }
!/<cert>/ { started = 0; }' file1 file2 *.txt
It checks each line and keeps track of whether the previous line matched <cert>. (!/pattern/ sets the flag back to zero on lines not matching /pattern/.) Call it with all your files (or with a wildcard like *.txt).
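If GNU grep is available, a grep-only sketch also works, assuming the two tags sit on adjacent lines with nothing between them: -z reads each file as a single NUL-delimited record, so the -P pattern can match across the line break, and -l prints only the names of matching files.

```shell
# List files in which </cert> immediately follows <cert>.
grep -lPz '<cert>\n</cert>' *.txt
```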
And a friendly suggestion: next time, try each command separately, and have a quick look at the manual for each tool you want to use. Unix tools are usually too complex for simple trial and error.

How to use a whitelist with grep

I have a list of email addresses. I also have a list of common first and last names. I want to filter the email list against the one with common first and last names, thus only printing emails with either a common first and / or last name in the output file.
So, I tried:
cat file | egrep -e -i < whitelist | tee emails_with_common_first_and_last_names.txt
At first, this seemed like it was working. Then, after examining the output, it did not seem to do anything.
wc -l input output
This revealed that nothing was filtered.
So, how else can I do this or what am I doing incorrectly?
Here is a sample of the file that I would like filtered:
aynz#falskdf.com
8zlkhsdf0#fmail.com
afjsg#domain.com
Here is a sample of the whitelist that I would like to use as a reference to filter the file:
ALEX
johnson
WINTERS
miles
christina
tonya
jackson
schmidt
jake
So, if an email contains any of these, grep (or whatever) needs to print it to an output file.
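A minimal sketch of one way to do this, assuming a name may appear anywhere in the address: combine -i (ignore case), -F (treat each whitelist entry as a fixed string), and -f (read the entries from a file). The filenames below are placeholders.

```shell
# Keep only emails containing a whitelisted name, case-insensitively.
grep -iFf whitelist.txt emails.txt > emails_with_common_first_and_last_names.txt
```

Note that the original attempt piped the whitelist to grep's standard input with `<`, which is not how -f works; -f takes the pattern file as its argument.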

Opposite of "only-matching" in grep?

Is there any way to do the opposite of showing only the matching part of strings in grep (the -o flag), that is, show everything except the part that matches the regex?
That is, the -v flag is not the answer, since that would not show lines containing the match at all, but I want to show these lines, just without the part of the line that matches.
EDIT: I wanted to use grep over sed, since it can do "only-matching" matches on multi-line, with:
grep -Pzo "<starttag>.*?(\n.*?)+.*?</starttag>" file.xml
This is a rather unusual requirement; I don't think grep can alter the matched lines like that. You can achieve this with sed, though:
sed -n "s/$PATTERN//gp" file
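For example, with the pattern held in a shell variable (note the double quotes, so the shell expands $PATTERN inside the sed script; single quotes would pass the literal text $PATTERN to sed):

```shell
# Strip the pattern from matching lines; -n plus the p flag prints
# only the lines where a substitution actually happened.
PATTERN='ERROR: '
printf '%s\n' 'ERROR: disk full' 'all good' | sed -n "s/$PATTERN//gp"
```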
EDIT in response to OP's edit:
You can do multiline matching with sed, too, if the file is small enough to load it all into memory:
sed -rn ':r;$!{N;br};s/<starttag>.*?(\n.*?)+.*?<\/starttag>//gp' file.xml
You can do that with a little help from sed:
grep "pattern" input_file | sed 's/pattern//g'
I don't think there is a way in grep.
If you use ack, you could output Perl's special variables $` and $' to show everything before and after the match, respectively:
ack string --output="\$\`\$'"
Similarly if you wanted to output what did match along with other text, you could use $& which contains the matched string;
ack string --output="Matched: $&"
