Grepping exact value but printing entire line - grep

I have a 2-column tab-separated file that I need to grep specific information from.
The file looks like this:
wych_hazel agt|plt
wytensin agt
x agt|com|qud
xanax agt
xtc agt
xylocaine agt
yellow_jacket agt|anm
I need to grep from the 2nd column, those lines that ONLY have the value agt.
The desired output would be this:
wytensin agt
xanax agt
xtc agt
xylocaine agt
I have tried:
grep -e 'agt' input
which gives me:
wych_hazel agt|plt
wytensin agt
x agt|com|qud
xanax agt
xtc agt
xylocaine agt
yellow_jacket agt|anm
then I have tried:
grep -oh 'agt' input
which gives me:
agt
agt
agt
agt
agt
agt
agt
What grep parameters should I introduce to arrive at my desired result?

This is a job for awk: just tell it to look for those lines in which the second field is exactly agt:
$ awk '$2=="agt"' file
wytensin agt
xanax agt
xtc agt
xylocaine agt
With grep, you can instead require whitespace before agt and the end of the line right after it:
grep '\sagt$' file
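To compare the two approaches side by side, here is a minimal sketch that rebuilds a few of the sample rows in a temporary file. The variable names (tmp, awk_out, grep_out) are only illustrative, and it assumes the columns are separated by a single tab, as the question states.

```shell
# Recreate part of the sample data (tab-separated) in a temporary file.
tmp=$(mktemp)
printf '%s\t%s\n' \
  wych_hazel 'agt|plt' \
  wytensin agt \
  x 'agt|com|qud' \
  xanax agt \
  yellow_jacket 'agt|anm' > "$tmp"

# awk: split on tabs only and compare the second field exactly.
awk_out=$(awk -F'\t' '$2 == "agt"' "$tmp")

# grep: whitespace, then agt, then end of line.
# [[:space:]] is the POSIX spelling of the GNU-specific \s.
grep_out=$(grep '[[:space:]]agt$' "$tmp")

printf '%s\n' "$awk_out"
rm -f "$tmp"
```

Both commands should select only the wytensin and xanax rows here. The awk version states the intent (second field equals agt) most directly, which is why it is usually the easier one to maintain.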

If agt is always at the end of the line when it's on its own, then you can just do:
grep 'agt$' input

Anchor the pattern to the whole line with the -E parameter, so that the second column has to be exactly agt:
grep -E '^[^[:space:]]+[[:space:]]+agt$' file


how to avoid lookbehind assertion is not fixed length

I have a file that contains a version number that I need to output. This version number is part of a string in this file that looks something like this:
https://some-link:1234/path/to/file/name-of-file/1.2.345/name-of-file_CXP123456-1.2.345.jar"
I need to get the version number, which is 1.2.345.
This grep command works: grep -Po '(?<=/name-of-file_CXP123456-/)\d.\d.\d\d\d'. However, the CXP number changes and as such I thought I could do something like this: grep -Po '(?<=/name-of-file_*-/)\d.\d.\d\d\d' but that gives the following:
grep: lookbehind assertion is not fixed length
Is there anything I can add to the grep statement to avoid this?
Ultimately, this is part of a stage in Jenkins to get this version number. The sh command looks something like this:
VERSION = sh 'ssh -tt user@ip-address "cat dir/file*.content | grep -Po '(?<=/name-of-file_*-/)\d.\d.\d\d\d' 1>&2"'
You can use
grep -Po '/name-of-file_.*-\K\d+(?:\.\d+)+'
Details:
/name-of-file_ - a literal text
.* - any zero or more chars other than line break chars as many as possible
- - a hyphen
\K - a match reset operator that omits all text matched so far from the memory buffer
\d+ - one or more digits
(?:\.\d+)+ - one or more sequences of a . and one or more digits.
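As a quick check of the pattern described above, here is a small sketch that runs it against the sample string from the question. It assumes a grep built with PCRE support (the -P option, as in GNU grep); the variable names url and version are only illustrative.

```shell
# Sample line from the question, including its trailing double quote.
url='https://some-link:1234/path/to/file/name-of-file/1.2.345/name-of-file_CXP123456-1.2.345.jar"'

# \K drops everything matched so far, so -o prints only the digits and dots after the hyphen.
version=$(printf '%s\n' "$url" | grep -Po '/name-of-file_.*-\K\d+(?:\.\d+)+')

printf '%s\n' "$version"
```

Because .* is greedy, the match backs up to the last hyphen that is followed by a digit, so the changing CXP number never has to be spelled out.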
You don't need lookbehind for this job. You also don't need PCREs, or grep at all.
#!/usr/bin/env bash
# ^^^^- bash, *not* sh
case $BASH_VERSION in '') echo "ERROR: bash required" >&2; exit 1;; esac
string="https://some-link:1234/path/to/file/name-of-file/1.2.345/name-of-file_CXP123456-1.2.345.jar"
regex='.*/name-of-file_CXP[[:digit:]]+-([[:digit:].]+)[.]jar'
if [[ $string =~ $regex ]]; then
echo "Version is ${BASH_REMATCH[1]}"
else
echo "No version found in $string"
fi
Maybe too long for a comment... It looks like the version number is the second-to-last field if you split on forward slashes?
rev | cut -d/ -f 2 | rev
awk -F/ '{print $(NF-1)}'
perl -lanF/ -e 'print $F[-2]'
Or even something like: basename "$(dirname "$(cat filename)")"
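The same "second-to-last field" idea can also be sketched with shell parameter expansion alone, with no external commands. This assumes the line is already in a variable (here called url, a name chosen for illustration) and that the version directory always sits directly above the jar name.

```shell
url='https://some-link:1234/path/to/file/name-of-file/1.2.345/name-of-file_CXP123456-1.2.345.jar'

dir=${url%/*}        # drop the last /-separated field (the jar name)
version=${dir##*/}   # keep only what follows the last remaining slash

printf '%s\n' "$version"
```

This works in any POSIX shell, which may matter if the Jenkins step runs under sh rather than bash.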
For those who are really desperate, there is another solution, which requires you to pre-build your regex string.
It's not a solution I would recommend, but if there is really no other way, no one can stop you.
Even with this you won't have true dynamic look-behinds, and it is still quite limited, but it is an option available to you.
The idea is to build one look-behind for each possible length you need.
For example, to match only if the text is not preceded by a # (a look-behind of 0 to 100 characters):
reg='';
for ((i = 0 ; i <= 100 ; i++)); do reg+='(?<!#.{'"${i}"'})'; done;
reg+='someVariableName=.*?($|;|\\n)';
grep --perl-regexp "$reg" /usr/local/mgmsbox/msc/scripts/msc.cfg
This might not be the best example, but it gets the idea across.
This solution has its own pitfalls. For example, you need to double-escape (\\) escape sequences like \n, and any character that the shell should not interpret should be put in a single-quoted string (or use printf).
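To see the effect on concrete input, here is a reduced sketch of the same trick with a look-behind range of 0 to 20 characters and four made-up input lines. It assumes a grep with PCRE support (-P); the sample lines are hypothetical and not taken from any real config file.

```shell
# Build one fixed-length negative look-behind per distance (0..20 characters).
reg=''
i=0
while [ "$i" -le 20 ]; do
  reg="$reg"'(?<!#.{'"$i"'})'
  i=$((i + 1))
done
reg="$reg"'someVariableName='

# Lines where a # appears up to 20 characters before the name are rejected.
kept=$(printf '%s\n' \
  'someVariableName=1' \
  '# someVariableName=2' \
  '  #x someVariableName=3' \
  'other; someVariableName=4' | grep -P "$reg")

printf '%s\n' "$kept"
```

Only the first and last lines should survive, because in the other two a # sits within the covered distance. Anything further away than the largest look-behind you built would slip through, which is the main limitation of this approach.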

How to print specific field that matches with a regex in awk

It is a simple question, but I didn't find an answer...
I have a tab-separated file with many rows and a different number of fields in each row. Like this:
a1_j a2_f a3_f a10_g a8_t a2_e
a2_j
a6_h a8_o
a9_g
I just want to print those fields that start with a2, but not the whole line, just the matched fields. Like this:
a2_f
a2_e
a2_j
I tried with awk, with no success.
I would use grep to do this:
grep -o 'a2_[a-z]' file
The -o switch means that only matches are printed, each on a separate line.
You could loop through all the fields with a for loop, or use fmt to put all the fields on 1 line:
~$ fmt -w1 f
a1_j
a2_f
a3_f
a10_g
a8_t
a2_e
a2_j
a6_h
a8_o
a9_g
and then filter with grep or, if you want to use awk (anchoring the pattern so that only fields starting with a2 match):
~$ fmt -w1 f | awk '/^a2/'
a2_f
a2_e
a2_j
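The for-loop alternative mentioned in this answer could look roughly like the sketch below. It feeds the sample rows from the question through printf instead of reading a file, and the variable name matches is only illustrative.

```shell
# Walk every tab-separated field and print only those starting with a2.
matches=$(printf 'a1_j\ta2_f\ta3_f\ta10_g\ta8_t\ta2_e\na2_j\na6_h\ta8_o\na9_g\n' |
  awk -F'\t' '{ for (i = 1; i <= NF; i++) if ($i ~ /^a2/) print $i }')

printf '%s\n' "$matches"
```

This stays within POSIX awk, so it does not depend on GNU extensions or on fmt splitting the fields first.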
With GNU awk for multi-char RS and \s:
$ awk -v RS='\\s' '/^a2/' file
a2_f
a2_e
a2_j

Find parameters of a method with grep

I need some help with a grep command (in the Bash).
In my source files, I want to list all unique parameters of a function. Background: I want to search through all files to see which permissions (perm("abc")) are used.
Example.txt:
if (x) perm("this"); else perm("that");
perm("what");
I'd like to have my grep output:
this
that
what
If I do my grep with this search expression
perm\(\"(.*?)\"\)
I'll get perm("this"), perm("that"), etc., but I'd like to have just the permissions: this and that and what.
How can I do that?
Use a look-behind:
$ grep -Po '(?<=perm\(")[^"]*' file
this
that
what
This looks for all the text occurring after perm(" and until another " is found.
Note -P is used to allow this behaviour (it is a Perl regex) and -o to just print the matched item, instead of the whole line.
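Since the question asks for unique parameters, the look-behind can be combined with sort -u. The sketch below assumes a grep with PCRE support (-P) and adds a repeated perm("this") line to the example, purely as an assumption to show the de-duplication; the names tmp and perms are illustrative.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
if (x) perm("this"); else perm("that");
perm("what");
perm("this");
EOF

# -h suppresses file names (useful once several files are searched),
# -o prints each match on its own line, sort -u removes duplicates.
perms=$(grep -Pho '(?<=perm\(")[^"]*' "$tmp" | sort -u)

printf '%s\n' "$perms"
rm -f "$tmp"
```

Note that sort -u also reorders the results alphabetically, so the output order no longer follows the order of appearance in the source.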
Here is a GNU awk version (GNU awk is needed because RS is more than one character):
awk -v RS='perm\\("' -F\" 'NR>1 {print $1}' file
this
that
what

How to get grep command output with a few lines?

Right now I'm using the following command to get a list of files that contain "random_string":
grep -l random_string *.txt
I want to get not only the filenames, but also a few lines after (and maybe before) each match.
So I want something like this as output:
file1.txt
random_string
....
....
random_string
....
....
file2.txt
....
OK, I got it.
I just need to replace the -l option with -A (lines after), -B (lines before), or -C (both before and after).
grep --after-context=10 some_string *.py
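As a rough illustration of what the context options print, the sketch below creates two small throwaway files (their names and contents are made up for the example) and asks for one line of context on each side.

```shell
dir=$(mktemp -d)
printf 'alpha\nrandom_string\nbeta\ngamma\n' > "$dir/file1.txt"
printf 'delta\nepsilon\n' > "$dir/file2.txt"

# With more than one file, grep prefixes every output line with the file name:
# "name:" marks a matching line and "name-" marks a context line.
out=$(cd "$dir" && grep -C 1 random_string *.txt)

printf '%s\n' "$out"
rm -rf "$dir"
```

Files without a match (file2.txt here) produce no output at all, so the file names of the hits can still be read from the prefixes.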

basic grep

I have a large file where each line contains a substring such as ABC123. If I execute
grep ABC file.txt
or
grep ABC1 file.txt
I get those lines back as expected, but if I execute
grep ABC12 file.txt
grep fails to find the corresponding lines.
This seems pretty trivial functionality, but I'm not a heavy user of grep so perhaps I'm missing some gotcha.
Use something like
od -x -a < filename
to dump out the file contents in hex. That'll immediately show you if what you have in your file is what you expect. Which I suspect it isn't :-)
Note: od has lots of useful options to help you here. Too many to list, in fact.
Is there a chance your file contains some hidden character, such as 0x00 ?
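To make the hidden-character theory concrete, here is a small sketch with a deliberately broken line: a non-printing byte (octal 001, chosen arbitrarily for the example) sits between ABC1 and 23. Whether the real file has this problem is of course only a guess.

```shell
tmp=$(mktemp)
# Looks like ABC123 in most terminals, but contains a control byte after ABC1.
printf 'ABC1\00123\n' > "$tmp"

short=$(grep -c 'ABC1' "$tmp")
long=$(grep -c 'ABC12' "$tmp" || true)   # grep exits non-zero when nothing matches

# od -c shows every byte, printing the invisible one as a three-digit octal code.
dump=$(od -c "$tmp")

printf 'ABC1 matches: %s, ABC12 matches: %s\n' "$short" "$long"
printf '%s\n' "$dump"
rm -f "$tmp"
```

This reproduces the symptom from the question: the shorter pattern matches, the longer one does not, and the dump reveals the stray byte between the 1 and the 2.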
This doesn't make sense. Are you sure the file contains "ABC123"?
You can verify this by running following command in a shell
echo "ABC123" | grep ABC12
If the lines contain ABC123, then "grep ABC12" should get them. Do you perhaps mean that you want to match several different strings, such as ABC1, ABC2 and ABC3? In that case you can try this:
grep -E 'ABC1|ABC2|ABC3'
I'm not sure what the problem is; grep works exactly as it should. For example, the contents of my test file:
$ cat file.txt
ABC
ABC1
ABC12
ABC123
...and grepping for ABC, ABC1, ABC12, ABC123:
$ grep ABC file.txt
ABC
ABC1
ABC12
ABC123
$ grep ABC1 file.txt
ABC1
ABC12
ABC123
$ grep ABC12 file.txt
ABC12
ABC123
$ grep ABC123 file.txt
ABC123
grep is basically a filter: any line containing the pattern given as its first argument (ABC, ABC1, etc.) will be displayed. If a line doesn't contain the entire string, it will not be displayed.
