I am trying to parse items out of a file I have. I can't figure out how to do this with grep.
Here is the syntax:
<FQDN>Compname.dom.domain.com</FQDN>
<FQDN>Compname1.dom.domain.com</FQDN>
<FQDN>Compname2.dom.domain.com</FQDN>
I want to print just the text between the > and the <.
Can anyone assist?
Thanks
grep can do some text extraction, though I'm not sure this is what you want:
grep -Po "(?<=>)[^<]*"
Test:
kent$ echo "<FQDN>Compname.dom.domain.com</FQDN>
<FQDN>Compname1.dom.domain.com</FQDN>
<FQDN>Compname2.dom.domain.com</FQDN>" | grep -Po "(?<=>)[^<]*"
Compname.dom.domain.com
Compname1.dom.domain.com
Compname2.dom.domain.com
Grep isn't what you are looking for.
Try sed with a regular expression : http://unixhelp.ed.ac.uk/CGI/man-cgi?sed
You can do what you want with grep:
grep -oP '<FQDN>\K[^<]+' FILE
Output:
Compname.dom.domain.com
Compname1.dom.domain.com
Compname2.dom.domain.com
As others have said, grep is not the ideal tool for this. However:
$ echo '<FQDN>Compname.dom.domain.com</FQDN>' | egrep -io '[a-z]+\.[^<]+'
Compname.dom.domain.com
Remember that grep's purpose is to MATCH things. The -o option shows you what it matched. To impose conditions that are not part of the returned match, you need look-ahead or look-behind, which many command-line grep implementations do not support because these are PCRE features rather than ERE.
$ echo '<FQDN>Compname.dom.domain.com</FQDN>' | grep -Po '(?<=>)[^<]+'
Compname.dom.domain.com
The -P option will work in most Linux environments, but not in *BSD or OSX or Solaris, etc.
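Where -P is not available, the same extraction can be sketched with POSIX sed instead. This is only a minimal sketch: the function name extract_fqdn is made up for illustration, and it assumes one <FQDN>...</FQDN> pair per line, as in the question.

```shell
# Hypothetical helper: print the text between <FQDN> and </FQDN> on each input line.
# Uses only POSIX sed features, so it does not depend on grep -P.
# Lines without the tag pair are skipped (-n plus the p flag).
extract_fqdn() {
    sed -n 's/.*<FQDN>\([^<]*\)<\/FQDN>.*/\1/p'
}

printf '%s\n' '<FQDN>Compname.dom.domain.com</FQDN>' \
              '<FQDN>Compname1.dom.domain.com</FQDN>' | extract_fqdn
```

To read from a file instead, redirect it into the function: extract_fqdn < FILE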
I'm trying to find lines with words not preceded by double colons (::).
Example
void myClass::doMything() // I don't want this line
myObj->doMyThing() // I want this line
My goal is to get the lines where some methods are used, but not where the methods are defined.
I tried this command:
grep --color=always -rwna "methodName" --include=*.cpp | grep -v "::methodName"
but it doesn't work: it still outputs lines containing
::methodName
I've also tried:
grep --color=always -rwna "methodName" --include=*.cpp | grep -v "\:\:methodName"
egrep --color=always -rwna "methodName" --include=*.cpp | egrep -v "\:\:methodName"
but neither works.
What should I do?
Although grep is probably the most commonly used Linux CLI tool, that doesn't mean it's perfect. What you are trying to achieve is not possible with grep's basic regex; you need Perl/Python-style regex (look-behind) here.
As a workaround (I assume you only want to find lines where the method is invoked) you can try:
grep -Eno "(::)?methodName" your_input_files | grep -v "::methodName"
-n prints the line number, which I think you will find convenient
-o prints only the matched part; I use it here to split the output so that each match is on its own line (if methodName appears 5 times in a line of code, you get 5 lines of grep output)
(::)? distinguishes a declaration from an invocation of methodName; we need it when the second grep comes into play...
grep -v ...and here it comes, to get rid of what you don't want
I guess you will want to use this many times, so you could even put a function in your .bashrc:
find_invocations () {
# below example goes through current dir, but you can improve it :)
grep --color=yes -Eno "(::)?$1" * 2>/dev/null | grep -v "::$1"
}
In the function above you could take a risk and use $1.* instead of $1, but that goes wrong if you have both methodName and ::methodName on the same line. As far as I recall from my C++ lessons (ages ago, around 2010), methodName::methodName is a constructor...
...sorry for bad english
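For comparison, here is a minimal sketch of the Perl-style regex approach mentioned at the start of this answer. It assumes a grep built with PCRE support (-P); the function name find_calls and the sample lines are made up for illustration.

```shell
# Sample input modelled on the question; contents are made up for illustration.
sample='void myClass::doMyThing() // definition
myObj->doMyThing() // call'

# Hypothetical helper: print lines where the method name in $1 appears as a
# whole word that is NOT immediately preceded by "::".
# (?<!::) is a negative look-behind; \b marks a word boundary.
find_calls() {
    grep -P '(?<!::)\b'"$1"'\b'
}

printf '%s\n' "$sample" | find_calls doMyThing
```

With the sample above, only the myObj->doMyThing() line is printed. A line that contains both a definition and a call would still be printed, since grep selects whole lines.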
I've finally managed to make it work.
I've tried linux_beginner's suggestion:
grep -Eno '(::)?myMethodName' path/to/one/of/the/files.cpp | grep -v '::myMethodName'
with a single file, and this works. (I found I prefer not using the -o option, because I also want to see how it's used.)
I need to search multiple files anyway, so I also tried to include more files:
grep -Eno '(::)?myMethodName' --include=*.cpp | grep -v '::myMethodName'
but in this case it seems to get stuck in the search (maybe it triggers some slow scripting? perl or python?).
I've checked RavinderSingh13's command. Taken on its own, it captures the lines with a double colon (and only those, correctly), on a single file or on multiple files:
grep -rna '::myMethodName' path/to/one/of/the/file.cpp
grep -rna '::myMethodName' --include=*.cpp
but the -w switch must not be present, so the following:
grep -rwna '::myMethodName' path/to/one/of/the/file.cpp
grep -rwna '::myMethodName' --include=*.cpp
don't return any results.
RavinderSingh13's suggestion, put inside the pipeline, doesn't manage to filter out the double-colon lines (my original goal), with either a single file or multiple files:
grep -rwna 'myMethodName' path/to/one/of/the/files.cpp | grep -v '::[[:alpha:]]+'
-> extracts both myMethodName and ::myMethodName from the chosen file
grep -rwna 'myMethodName' --include=*.cpp | grep -v '::[[:alpha:]]+'
-> extracts both myMethodName and ::myMethodName from all the cpp files
Now, how I solved it:
Usually, when I chain grep commands, I also add the switch --color=always to the first of them, which preserves the colouring of results across the pipeline.
But that... was the culprit!
That is, doing
grep --color=always -rwna 'myMethodName' --include=*.cpp | grep -v '::myMethodName'
preserves the color in results, but sadly fails to exclude lines containing ::myMethodName, while
grep -rwna 'myMethodName' --include=*.cpp | grep -v '::myMethodName'
gives colorless but correct results (it manages to filter out the double-colon lines).
The distribution on which I tested these commands is Ubuntu 20.04.1 LTS.
Grep version : grep (GNU grep) 3.4
Thanks everybody for the interest.
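The reason --color=always breaks the filter is that the first grep wraps each match in ANSI escape sequences, so in the piped text "::" and "myMethodName" are no longer adjacent and the pattern '::myMethodName' cannot match. One way to keep both the filtering and the colour, sketched below, is to filter on plain text and colour only in the last grep of the pipeline. The function name show_calls and the sample lines are made up for illustration.

```shell
# Hypothetical helper: read lines on stdin, keep whole-word uses of
# myMethodName, drop the "::myMethodName" definitions, then colour last.
show_calls() {
    grep -w 'myMethodName' \
        | grep -v '::myMethodName' \
        | grep --color=always 'myMethodName'
}

printf '%s\n' 'void myClass::myMethodName() // definition' \
              'myObj->myMethodName() // call' | show_calls
```

Because the colouring grep comes last, nothing downstream has to match against text that contains escape sequences.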
I have a bunch of strings that I have to fetch the 'port_num' from -
"76 : client=new; tags=circ, LINK; port_num=switch01; far_port=Gi1/0"
The word might be in a different place in the string and it might be a different length, but it always says 'port_num=' before it and ';' after it...
I only want this bit- 'switch01'
Currently I use:
| grep -Eo 'port_num=.+' | cut -d"=" -f2 | cut -d";" -f1
But there has got to be a better way.
You can try grep -oP '(?<=port_num=).+(?=;)'. If you run this:
echo "76 : client=new; tags=circ, LINK; port_num=switch01; far_port=Gi1/0" \
| grep -oP '(?<=port_num=).+(?=;)'
the result will be:
switch01
Updated answer: grep -oP '(?<=port_num=)[^;]+(?=;)'
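The difference between the two patterns only shows up when more than one ';' follows port_num= on the line. A minimal sketch, using a made-up input string with an extra field at the end (the speed=1G field is not from the question):

```shell
# Made-up input with a second ';' after the port_num field.
s='76 : client=new; port_num=switch01; far_port=Gi1/0; speed=1G'

# Greedy .+ runs on to the last ';' on the line.
printf '%s\n' "$s" | grep -oP '(?<=port_num=).+(?=;)'

# [^;]+ cannot cross a ';', so it stops at the first one after port_num=.
printf '%s\n' "$s" | grep -oP '(?<=port_num=)[^;]+(?=;)'
```

The first command prints "switch01; far_port=Gi1/0", the second prints "switch01".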
This is what I would use:
... | grep -E 'port_num=.+' | sed 's/^.*port_num=\([^;]*\).*$/\1/'
This works with or without the -o on grep, and the availability of -P will depend on the version of grep you have. (e.g., my grep does not have it). I'm not saying the other answers that rely on -P aren't any good -- they look fine to me. But grep -P will be less portable.
IMHO, piping grep with sed allows each utility to do what it specializes in -- grep is for selecting lines, sed is for modifying lines.
This can be done in a simple sed command:
s="76 : client=new; tags=circ, LINK; port_num=switch01; far_port=Gi1/0"
sed 's/.*port_num=\([^;]*\);.*/\1/' <<< "$s"
switch01
... | grep -Po 'port_num.+(?=;)'
This uses grep's Perl Compatible Regular Expression (PCRE) syntax. The (?=;) is a look-ahead assertion which looks for a match with ";" but doesn't include it in the matched output.
This produces:
port_num=switch01
As @Vladimir Kovpak noted, if you want to exclude the "port_num=" string from this output, add a look-behind assertion:
... | grep -Po '(?<=port_num=).+(?=;)'
I want to do something like:
grep -A 10 'myString' && NOT 'anotherString'
If I didn't need -A 10, I know I could pipe greps and use -v, as below, but that would not work in this case:
grep "myString" | grep -v "anotherString"
Any ideas?
Try inverting the order and placing the grep with the -A 10 argument at the end, like this:
grep -v 'anotherString' | grep -A 10 'myString'
The only POSIX-supported options for grep are -EFcefilnqsvx, so be aware that the -A option may not be present in all implementations of grep. Even GNU grep has no option to specify "match this AND NOT that", and no regex can emulate it, because a pattern can only provide additional matches; it cannot withhold them. Essentially, the only way to accomplish this with grep alone is to use a pipe.
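The pipe can be wrapped in a small function. This is a minimal sketch: the name grep_after_excluding and the sample lines are made up for illustration, and it assumes a grep that supports -A (GNU and BSD grep do).

```shell
# Hypothetical helper: drop every line containing $2, then print each line
# matching $1 plus the $3 lines that follow it.
grep_after_excluding() {
    grep -v -- "$2" | grep -A "$3" -- "$1"
}

printf '%s\n' 'myString first' 'context a' 'anotherString myString' \
              'context b' 'myString last' \
    | grep_after_excluding 'myString' 'anotherString' 1
```

Note that the exclusion runs first, so lines containing anotherString are also removed from the context that -A would otherwise show.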
I need some help with a grep command (in the Bash).
In my source files, I want to list all unique parameters of a function. Background: I want to search through all files to see which permissions (perm("abc")) are used.
Example.txt:
if (x) perm("this"); else perm("that");
perm("what");
I'd like to have my grep output:
this
that
what
If I do my grep with this search expression
perm\(\"(.*?)\"\)
I get perm("this"), perm("that"), etc., but I'd like to have just the permissions: this, that and what.
How can I do that?
Use a look-behind:
$ grep -Po '(?<=perm\(")[^"]*' file
this
that
what
This looks for all the text occurring after perm(" and until another " is found.
Note -P is used to allow this behaviour (it is a Perl regex) and -o to just print the matched item, instead of the whole line.
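Since the question asks for unique parameters, the output can be piped through sort -u. A minimal sketch, assuming a grep with -P support; the function name list_perms is made up, and the sample repeats one permission to show the de-duplication:

```shell
# Hypothetical helper: extract every perm("...") argument from stdin and
# print each distinct value once, in sorted order.
list_perms() {
    grep -Po '(?<=perm\(")[^"]*' | sort -u
}

printf '%s\n' 'if (x) perm("this"); else perm("that");' \
              'perm("what"); perm("this");' | list_perms
```

The result is sorted rather than in order of first appearance, so for this input it prints that, this, what.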
Here is a GNU awk version (GNU awk is needed because RS contains multiple characters):
awk -v RS='perm\\("' -F\" 'NR>1 {print $1}' file
this
that
what
I would like some advice on how to exclude a word in a line using grep but still keep the line?
So I have tried:
grep -v '1.942134' results.tbl | egrep '*.fits' results.tbl
to try to list all the lines with the extension .fits but with "1.942134" removed from them, but it still returns the full lines.
Any advice?
Or you can use awk:
awk '/\.fits/ && !/1\.942134/' results.tbl
PS: you should escape the . in both sed and awk, or else it will match any character.
You should pipe to sed. Sed has lots of abilities, some of them more complicated than others, but one of its best is regexp substitution.
grep '\.fits$' results.tbl | sed 's/1\.942134//'
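The selection and the substitution can also be done in a single sed call. This is a minimal sketch: the function name strip_value_from_fits_lines and the sample lines are made up, since the real layout of results.tbl is not shown in the question.

```shell
# Hypothetical helper: keep only lines containing ".fits" and delete the
# literal text 1.942134 from them, leaving the rest of each line intact.
# Lines with ".fits" but without the number are printed unchanged.
strip_value_from_fits_lines() {
    sed -n '/\.fits/{s/1\.942134//g;p;}'
}

printf '%s\n' 'a.fits 1.942134 ok' 'b.txt 1.942134' 'c.fits 2.5' \
    | strip_value_from_fits_lines
```

To run it on the file from the question: strip_value_from_fits_lines < results.tbl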