How to exclude from grep double colons? - grep

I'm trying to find lines with words not preceded by double colons (::).
Example
void myClass::doMyThing() // I don't want this line
myObj->doMyThing() // I want this line
My goal is to get the lines where some methods are used, but not where the methods are defined.
I try with this command :
grep --color=always -rwna "methodName" --include=*.cpp | grep -v "::methodName"
but it doesn't work : it keeps extracting also lines containing
::methodName
I've also tried by writing
grep --color=always -rwna "methodName" --include=*.cpp | grep -v "\:\:methodName"
egrep --color=always -rwna "methodName" --include=*.cpp | egrep -v "\:\:methodName"
but neither works.
What should I do ?

Although grep is probably the most commonly used of all Linux CLI tools, that doesn't mean it's perfect. What you are trying to achieve is awkward to express with grep's basic regex; the direct way needs a Perl-style regex (a negative lookbehind).
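The Perl-style regex mentioned above can be sketched as follows. This is a minimal example, assuming GNU grep built with PCRE support (the -P option); the two sample lines and the name doMyThing are made up for illustration.

```shell
# Hypothetical two-line sample: a definition and a call.
sample() {
  printf '%s\n' 'void myClass::doMyThing()' 'myObj->doMyThing()'
}

# (?<!::) is a negative lookbehind: doMyThing matches only when it is
# not immediately preceded by "::". \b keeps the match on word boundaries.
calls=$(sample | grep -nP '(?<!::)\bdoMyThing\b')

printf '%s\n' "$calls"
# prints: 2:myObj->doMyThing()
```

For a real source tree the same pattern could be combined with -r and --include='*.cpp'; where -P is not available, the two-step pipeline is the fallback.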
As a workaround (I assume you are trying to only find line where method is invoked) you can try:
grep -Eno "(::)?methodName" your_input_files | grep -v "::methodName"
-n prints the line number, which I think you will find convenient
-o prints only the matched part; I use it here to split the output so that each match is on its own line (if a line of code contains methodName 5 times, you get 5 lines of grep output)
(::)? distinguishes a declaration from an invocation of methodName; we need it when the 2nd grep comes into play...
grep -v ...and here it comes, to get rid of what you don't want
I guess you will want to use this many times, so you can even make it a function in your .bashrc
find_invocations () {
# below example goes through current dir, but you can improve it :)
grep --color=yes -Eno "(::)?$1" * 2>/dev/null | grep -v "::$1"
}
In the above function you could take a risk and use $1.* instead of $1, but an unpleasant case is a line that contains both methodName and ::methodName. As far as I remember from my C++ lessons (ages ago, around 2010), methodName::methodName is a constructor...
...sorry for bad english

I've finally managed to make it work.
I've tried linux_beginner's suggestion:
grep -Eno '(::)?myMethodName' path/to/one/of/the/files.cpp | grep -v '::myMethodName'
with a single file and this works. (I found I prefer not using the -o option, because I also want to see how it's used.)
For this search I need to cover multiple files anyway, so I've also tried to include more files:
grep -Eno '(::)?myMethodName' --include=*.cpp | grep -v '::myMethodName'
but in this case it seems to get stuck in the search (maybe it triggers some slow scripting? Perl or Python?). More likely, since this command has neither -r nor a file argument, grep is simply waiting for input on stdin.
I've checked RavinderSingh13's command. Taken on its own, it captures the lines with a double colon (and only them, correctly), both on a single file and on multiple files:
grep -rna '::myMethodName' path/to/one/of/the/file.cpp
grep -rna '::myMethodName' --include=*.cpp
but the -w switch must not be there, so the following:
grep -rwna '::myMethodName' path/to/one/of/the/file.cpp
grep -rwna '::myMethodName' --include=*.cpp
don't get any result.
RavinderSingh13's suggestion, put inside the pipeline, doesn't manage to filter out the double-colon lines (my original goal), with either a single file or multiple files:
grep -rwna 'myMethodName' path/to/one/of/the/files.cpp | grep -v '::[[:alpha:]]+'
-> extracts both myMethodName and ::myMethodName from the chosen file
grep -rwna 'myMethodName' --include=*.cpp | grep -v '::[[:alpha:]]+'
-> extracts both myMethodName and ::myMethodName from all the cpp files
Now, how I solved it:
usually, when I chain grep commands, I also add the --color=always switch to the first of them, which preserves the coloring of the results across the whole pipeline.
But that... was the culprit !
i.e., doing
grep --color=always -rwna 'myMethodName' --include=*.cpp | grep -v '::myMethodName'
preserves the color in the results but sadly fails to exclude lines containing ::myMethodName (the color escape codes inserted around the match separate :: from myMethodName), while
grep -rwna 'myMethodName' --include=*.cpp | grep -v '::myMethodName'
gives colorless but correct results (manages to filter out the double-colon lines).
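The effect described above can be reproduced on a tiny input. This is only a sketch, assuming GNU grep; the sample lines and the variable names are invented for illustration. It also shows one possible way to keep the color: filter on plain text first and colorize in the last grep of the pipeline.

```shell
# Hypothetical two-line sample: a definition and a call.
sample() {
  printf '%s\n' 'void myClass::doMyThing()' 'myObj->doMyThing()'
}

# Color added in the FIRST grep: escape codes land between "::" and
# "doMyThing", so the second grep never sees "::doMyThing" and drops nothing.
colored=$(sample | grep --color=always -w 'doMyThing' | grep -v '::doMyThing')

# Filter on plain text, then color only in the LAST grep of the pipeline.
plain_first=$(sample | grep -w 'doMyThing' | grep -v '::doMyThing' | grep --color=always -w 'doMyThing')

# cat -v makes the escape sequences visible (they show up as ^[[...).
printf '%s\n' "$colored" | cat -v
printf '%s\n' "$plain_first" | cat -v
```

The first variable still holds both lines, the second holds only the call. The exact escape codes depend on the GREP_COLORS setting.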
The distribution on which I tried these commands is Ubuntu 20.04.1 LTS.
Grep version : grep (GNU grep) 3.4
Thanks everybody for the interest.

Related

Find parameters of a method with grep

I need some help with a grep command (in the Bash).
In my source files, I want to list all unique parameters of a function. Background: I want to search through all files to see which permissions ([perm("abc")]) are used.
Example.txt:
if (x) perm("this"); else perm("that");
perm("what");
I'd like to have my grep output:
this
that
what
If I do my grep with this search expression
perm\(\"(.*?)\"\)
I'll get perm("this"), perm("that"), etc. but I'd like to have just the permissions: this and that and what.
How can I do that?
Use a look-behind:
$ grep -Po '(?<=perm\(")[^"]*' file
this
that
what
This looks for all the text occurring after perm(" and until another " is found.
Note -P is used to allow this behaviour (it is a Perl regex) and -o to just print the matched item, instead of the whole line.
Here is a GNU awk version (GNU awk is needed because RS contains multiple characters):
awk -v RS='perm\\("' -F\" 'NR>1 {print $1}' file
this
that
what
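The question asks for unique parameters, while the commands above print every occurrence. A small sketch of one way to deduplicate, assuming GNU grep with -P; the sample lines, including the repeated perm("this"), are made up for illustration.

```shell
# Hypothetical input with one permission used twice.
sample() {
  printf '%s\n' 'if (x) perm("this"); else perm("that");' 'perm("what");' 'perm("this");'
}

# Extract each argument with the look-behind, then let sort -u
# remove duplicates (the result comes out sorted).
unique=$(sample | grep -Po '(?<=perm\(")[^"]*' | sort -u)

printf '%s\n' "$unique"
# prints:
# that
# this
# what
```

If the original order of first appearance matters, sort -u is not suitable and another deduplication step would be needed.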

How to grep in one line starting from particular string to end with particular string

I want to grep "[calluid]=aab01b055-89e3-49f3-839e-507bb128d07e&smscresponse"
in the file below
2014-10-15 18:38:32,831 plivo-rest[2781]: INFO: Fetching GET http://*******/outbound_callback.aspx with smscresponse[to]=8912722fsf9&smscresponse[ALegUUID]=5bb516fsd64-546c-11e4-879f-551816a551303677&smscresponse[calluid]=aab01b055-89e3-49f3-839e-507bb128d07e&smscresponse[direction]=outbosund&smscresfdsponse[endreason]=UNALLOCATED_NUMBER&smscresponse[from]=83339995896999&smscresponse[starttime]=0&smscresponse[ALegRequestUUID]=5bb4bafc-546c-11e4-891d-000c29ec6e41&smscresponse[RequestUUID]=5bb4bafc-546c-11e4-891d-000c29ec6e41&smscresponse[callstatus]=completed&smscresponse[endtime]=1413378509&smscresponse[ScheduledHangupId]=5bb4c15a-546c-11e4-891d-000c29ec6e41&smscresponse[event]=missed_call_hangup
I used this command
$ grep -oP '(calluid).*$'
This matches up to the end of the line.
I used this command
$ grep -oP '(calluid).{40}'
It fetches 40 characters, but I have thousands of calluid values, each with a different number of characters.
So please guide me on how to grep the exact calluid data.
Use a lookahead to force the regex engine to match up to a specific character or boundary.
$ grep -oP '\[calluid\][^\]\[]*(?=\[|$)' file
[calluid]=aab01b055-89e3-49f3-839e-507bb128d07e&smscresponse
Here is a GNU awk version (GNU awk is needed because RS contains multiple characters):
awk -v RS="[[]calluid[]]=" -F[ 'NR==2 {print $1}' file
aab01b055-89e3-49f3-839e-507bb128d07e&smscresponse
You can also set RS like this: RS="\\\[calluid]="
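If only the identifier itself is wanted, without the [calluid]= prefix and the trailing &smscresponse, a variation with \K may be enough. This is a sketch assuming GNU grep with -P and assuming the value never contains an & character; the input line is shortened from the log line in the question.

```shell
# Shortened copy of the log line from the question.
line='smscresponse[to]=8912722fsf9&smscresponse[calluid]=aab01b055-89e3-49f3-839e-507bb128d07e&smscresponse[direction]=outbosund'

# \K discards everything matched so far, so only the value is printed;
# [^&]* stops at the next "&" separator.
uid=$(printf '%s\n' "$line" | grep -oP '\[calluid\]=\K[^&]*')

printf '%s\n' "$uid"
# prints: aab01b055-89e3-49f3-839e-507bb128d07e
```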

Simple Grep Issue

I am trying to parse items out of a file I have. I can't figure out how to do this with grep.
Here is the syntax:
<FQDN>Compname.dom.domain.com</FQDN>
<FQDN>Compname1.dom.domain.com</FQDN>
<FQDN>Compname2.dom.domain.com</FQDN>
I want to spit out just the bits between the > and the <
can anyone assist?
Thanks
grep can do some text extraction. However, I'm not sure if this is what you want:
grep -Po "(?<=>)[^<]*"
test
kent$ echo "<FQDN>Compname.dom.domain.com</FQDN>
<FQDN>Compname1.dom.domain.com</FQDN>
<FQDN>Compname2.dom.domain.com</FQDN>" | grep -Po "(?<=>)[^<]*"
Compname.dom.domain.com
Compname1.dom.domain.com
Compname2.dom.domain.com
Grep isn't what you are looking for.
Try sed with a regular expression : http://unixhelp.ed.ac.uk/CGI/man-cgi?sed
You can do it like you want with grep :
grep -oP '<FQDN>\K[^<]+' FILE
Output:
Compname.dom.domain.com
Compname1.dom.domain.com
Compname2.dom.domain.com
As others have said, grep is not the ideal tool for this. However:
$ echo '<FQDN>Compname.dom.domain.com</FQDN>' | egrep -io '[a-z]+\.[^<]+'
Compname.dom.domain.com
Remember that grep's purpose is to MATCH things. The -o option shows you what it matched. To add regex conditions that are not part of the returned match, you need lookahead or lookbehind, which many command-line grep implementations do not support because it is part of PCRE rather than ERE.
$ echo '<FQDN>Compname.dom.domain.com</FQDN>' | grep -Po '(?<=>)[^<]+'
Compname.dom.domain.com
The -P option will work in most Linux environments, but not in *BSD or OSX or Solaris, etc.
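Since -P is not available everywhere, the sed route suggested above can be sketched like this. It uses only basic regular expressions, so it should behave the same with GNU and BSD sed; the sample input is taken from the question, and the assumption is one FQDN element per line.

```shell
# Sample input from the question.
sample() {
  printf '%s\n' '<FQDN>Compname.dom.domain.com</FQDN>' '<FQDN>Compname1.dom.domain.com</FQDN>'
}

# -n suppresses default output; the s command keeps only the captured
# text between the tags, and the p flag prints lines where it matched.
names=$(sample | sed -n 's|.*<FQDN>\([^<]*\)</FQDN>.*|\1|p')

printf '%s\n' "$names"
# prints:
# Compname.dom.domain.com
# Compname1.dom.domain.com
```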

Is there a way in grep to find out how many lines matched the grep result?

Suppose I write a grep query to find out the occurrence of a method call on an object like this:
// might not be accurate, but irrelevant
grep -nr "[[:alnum:]]\.[[:alnum:]](.*)" .
This would give many results. How to find out how many such results are obtained?
What about using | wc -l to count the number of result lines?
What about
man grep | grep "count"
It outputs
-c, --count
Suppress normal output; instead print a count of matching lines for each input file. [...]
Previous answers are OK, I just want to put it into command line instructions in order to have copy-paste versions (from explicit to simplest) for the future:
grep --count "PATTERN" FILE
Is exactly the same as:
grep -c "PATTERN" FILE
And it is equivalent to:
grep "PATTERN" FILE | wc -l
As a bonus, below I give you a version where a file with a list of patterns is used.
grep --count --file=PATTERNFILE FILE
or simply
grep -cf PATTERNFILE FILE
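One detail worth keeping in mind: -c counts matching lines, not individual matches. The following sketch, with made-up sample input, contrasts the two counts using the pattern from the question (shortened to stop at the opening parenthesis).

```shell
# Hypothetical input: two calls on the first line, none on the second,
# one on the third.
sample() {
  printf '%s\n' 'a.b(x) c.d(y)' 'no call here' 'e.f(z)'
}

# -c counts LINES that contain at least one match.
line_count=$(sample | grep -c '[[:alnum:]]\.[[:alnum:]](')

# -o prints each match on its own line, so wc -l counts MATCHES.
match_count=$(sample | grep -o '[[:alnum:]]\.[[:alnum:]](' | wc -l)

echo "lines: $line_count, matches: $match_count"
```

Here the line count is 2 and the match count is 3. With -r, note that -c reports one count per file rather than a grand total.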

How to truncate long matching lines returned by grep or ack

I want to run ack or grep on HTML files that often have very long lines. I don't want to see very long lines that wrap repeatedly. But I do want to see just that portion of a long line that surrounds a string that matches the regular expression. How can I get this using any combination of Unix tools?
You could use the grep options -oE, possibly in combination with changing your pattern to ".{0,10}<original pattern>.{0,10}" in order to see some context around it:
-o, --only-matching
Show only the part of a matching line that matches PATTERN.
-E, --extended-regexp
Interpret pattern as an extended regular expression (i.e., force grep to behave as egrep).
For example (from @Renaud's comment):
grep -oE ".{0,10}mysearchstring.{0,10}" myfile.txt
Alternatively, you could try -c:
-c, --count
Suppress normal output; instead print a count of matching lines
for each input file. With the -v, --invert-match option (see
below), count non-matching lines.
Pipe your results thru cut. I'm also considering adding a --cut switch so you could say --cut=80 and only get 80 columns.
You could use less as a pager for ack and chop long lines: ack --pager="less -S" This retains the long line but leaves it on one line instead of wrapping. To see more of the line, scroll left/right in less with the arrow keys.
I have the following alias setup for ack to do this:
alias ick='ack -i --pager="less -R -S"'
grep -oE ".{0,10}error.{0,10}" mylogfile.txt
In the unusual situation where you cannot use -E, use lowercase -e and escape the braces instead: grep -oe ".\{0,10\}error.\{0,10\}" mylogfile.txt
Explanation:
cut -c 1-100
gets characters from 1 to 100.
The Silver Searcher (ag) supports this natively via the --width NUM option. It replaces the rest of longer lines with [...].
Example (truncate after 120 characters):
$ ag --width 120 '@patternfly'
...
1:{"version":3,"file":"react-icons.js","sources":["../../node_modules/@patternfly/ [...]
In ack3, a similar feature is planned but currently not implemented.
Taken from: http://www.topbug.net/blog/2016/08/18/truncate-long-matching-lines-of-grep-a-solution-that-preserves-color/
The suggested approach ".{0,10}<original pattern>.{0,10}" is perfectly good except for that the highlighting color is often messed up. I've created a script with a similar output but the color is also preserved:
#!/bin/bash
# Usage:
# grepl PATTERN [FILE]
# how many characters around the searching keyword should be shown?
context_length=10
# What is the length of the control character for the color before and after the
# matching string?
# This is mostly determined by the environmental variable GREP_COLORS.
control_length_before=$(($(echo a | grep --color=always a | cut -d a -f '1' | wc -c)-1))
control_length_after=$(($(echo a | grep --color=always a | cut -d a -f '2' | wc -c)-1))
grep -E --color=always "$1" $2 |
grep --color=none -oE \
".{0,$(($control_length_before + $context_length))}$1.{0,$(($control_length_after + $context_length))}"
Assuming the script is saved as grepl, then grepl pattern file_with_long_lines should display the matching lines but with only 10 characters around the matching string.
I put the following into my .bashrc:
grepl() {
$(which grep) --color=always "$@" | less -RS
}
You can then use grepl on the command line with any arguments that are available for grep. Use the arrow keys to see the tail of longer lines. Use q to quit.
Explanation:
grepl() {: Define a new function that will be available in every (new) bash console.
$(which grep): Get the full path of grep. (Ubuntu defines an alias for grep that is equivalent to grep --color=auto. We don't want that alias but the original grep.)
--color=always: Colorize the output. (--color=auto from the alias won't work since grep detects that the output is put into a pipe and won't color it then.)
"$@": Put all arguments given to the grepl function here.
less: Display the lines using less
-R: Show colors
-S: Don't break long lines
Here's what I do:
function grep () {
tput rmam;
command grep "$@";
tput smam;
}
In my .bash_profile, I override grep so that it automatically runs tput rmam before and tput smam after, which disables wrapping and then re-enables it.
ag can also take the regex trick, if you prefer it:
ag --column -o ".{0,20}error.{0,20}"
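To make the context-window trick concrete, here is a small sketch on a made-up line. The input string is hypothetical; the only assumption is a grep that supports -o and -E.

```shell
# Hypothetical long line with the keyword in the middle.
long_line='0123456789abcdefghij error klmnopqrstuvwxyz0123'

# Up to 10 characters of context on each side of the keyword;
# everything else on the line is dropped from the output.
snippet=$(printf '%s\n' "$long_line" | grep -oE '.{0,10}error.{0,10}')

printf '%s\n' "$snippet"
# prints: bcdefghij error klmnopqrs
```

Because the regex engine looks for the leftmost match first and .{0,10} is greedy, the window takes as much context as is available, up to the limit.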
