Using grep with word boundary for a numeric range - grep

Hi, I'm trying to use grep to match a specific range of values that all end in the same suffix,
e.g. "000RJ" to "015RJ", out of a larger range.
What would be the most effective way to do this, please?

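You can use grep with a pattern that spells out the numeric range, like
grep -wE '0(0[0-9]|1[0-5])RJ' fileA.txt
to match the numbers 000-015. Writing the digits as [0-1][0-5] would miss 006-009, because each bracket constrains only one digit.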
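To see which lines such a pattern accepts, here is a small check on made-up data. It is only a sketch: it assumes the values appear as whole words and that the range is split into 000-009 and 010-015 with an extended-regex alternation. The sample lines stand in for fileA.txt.

```shell
# Made-up sample lines; the real input would be fileA.txt.
sample=$(printf '%s\n' 'a 000RJ' 'b 007RJ' 'c 015RJ' 'd 016RJ' 'e 1015RJ' 'f 010RJX')

# -w requires the match to stand alone as a word; -E enables ( | ) alternation.
# After the leading 0, 0[0-9] covers 000-009 and 1[0-5] covers 010-015.
matches=$(printf '%s\n' "$sample" | grep -wE '0(0[0-9]|1[0-5])RJ')
printf '%s\n' "$matches"
```

Lines d, e and f are rejected: 016 is out of range, and -w refuses 1015RJ and 010RJX because the match would sit inside a longer word.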

Related

How to grep a still unknown specific word in a matched row

I have a file made of some rows and the one I am interested in is like this one:
free energy TOTEN = -96.86706464 eV
So with grep I can find the row I need and assign to a variable the value of the row with:
E=$(grep "free energy" OUTCAR_$i)
Now, how can I assign to E a specific word from the line matched by grep, in this case the numeric value? Please note that the value I want is not known in advance, but it always appears at the same position in the row.
Thank you
With GNU grep, you may use a PCRE regex solution:
E=$(grep -oP 'free energy.* \K-?[0-9][0-9.]*' "OUTCAR_$i")
With GNU sed, you may extract the negative value from a line:
E=$(sed -n '/free energy/{s/.* \(-\{0,1\}[0-9][0-9.]*\).*/\1/p}' "OUTCAR_$i")
If the number of non-whitespace chunks is fixed, extract the fifth field when the line contains free energy:
E=$(awk '$0 ~ /free energy/{print $5}' "OUTCAR_$i")
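To try the field-based approach without a real OUTCAR file, the sketch below writes the sample row from the question to a temporary file and extracts the fifth field. The temporary file and its extra first line are made up for illustration.

```shell
# Temporary stand-in for OUTCAR_$i; its contents are illustrative only.
tmp=$(mktemp)
printf '%s\n' 'some other row' 'free energy TOTEN = -96.86706464 eV' > "$tmp"

# Fields: free(1) energy(2) TOTEN(3) =(4) value(5) eV(6)
E=$(awk '/free energy/ {print $5}' "$tmp")
rm -f "$tmp"
printf '%s\n' "$E"
```

If the phrase occurs on several rows, E will hold one value per line; piping the awk output through tail -n 1 keeps only the last one.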

How to use grep command to filter a log file for a specific keyword within particular timestamp?

So
grep "xyz" file.log
will print all the lines containing xyz as a keyword, and
grep "01/APR/2014:16:3[5-9]" file.log
will print the lines within that time range. How can I use both features together, i.e. a keyword filter within a time range?
Just pipe your two greps together:
grep "xyz" file.log | grep "01/APR/2014:16:3[5-9]"
The first grep will parse out all the lines with xyz, the second grep will winnow that list down by the date given. Depending on your data set, reversing the greps could be faster.
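The same filter can also be done in one pass. A minimal sketch with awk, using invented log lines in place of file.log, prints a line only when both patterns match:

```shell
# Invented log lines; the real input would be file.log.
log=$(printf '%s\n' \
  '[01/APR/2014:16:34:59] xyz early' \
  '[01/APR/2014:16:36:10] xyz in range' \
  '[01/APR/2014:16:37:42] abc in range' \
  '[01/APR/2014:16:40:00] xyz late')

# Both conditions must hold; slashes are escaped inside the /.../ regex.
hits=$(printf '%s\n' "$log" | awk '/xyz/ && /01\/APR\/2014:16:3[5-9]/')
printf '%s\n' "$hits"
```

Only the 16:36:10 line passes, since it is the only one with the keyword inside the 16:35-16:39 window.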

grep for matching 1 to 2 digits in a sequence of numbers

I have below numbers in a file
44700101
44700201
44700301
44700401
44700501
44700601
44700701
44700801
44700901
44701001
I want to fetch the numbers above whose 5th and 6th digits, read together as a two-digit number, are greater than 05, using grep wildcards.
something like "grep ....[6-10].. file" should yield below
44700601
44700701
44700801
44700901
44701001
Any help will be appreciated. Thanks
gawk (GNU awk) approach:
awk '{split($0,a,"")}int(a[5]a[6])>5' file
gawk allows FS and the third argument of split() to be a null string, in which case the input is split into individual characters
split($0,a,"") - splits the numeric string into separate digits (filling array a)
int(a[5]a[6])>5 - prints the line if the integer formed by the 5th and 6th digits is greater than 5
grep approach:
grep '^[0-9]\{4\}\([1-9]\|0[6-9]\).*' file
The output (for both approaches):
44700601
44700701
44700801
44700901
44701001
Just use awk:
$ awk 'substr($0,5,2)+0 > 5' file
44700601
44700701
44700801
44700901
44701001
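A bracket expression matches a single character, so something like [6-10] cannot stand for the numbers 6 to 10. A sketch of the grep approach in extended-regex syntax, run on the numbers from the question, spells the range out as 06-09 or any two-digit value from 10 upwards:

```shell
nums=$(printf '%s\n' 44700101 44700201 44700301 44700401 44700501 \
                     44700601 44700701 44700801 44700901 44701001)

# Skip four digits, then accept 06-09 or 10-99 in positions 5 and 6.
picked=$(printf '%s\n' "$nums" | grep -E '^[0-9]{4}(0[6-9]|[1-9][0-9])')
printf '%s\n' "$picked"
```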

Only output values within a certain range

I run a command that produces lots of lines in my terminal - the lines are floats.
I only want certain numbers to be printed.
I know that I can pipe the results to egrep:
| egrep "(369|433|375|368)"
if I want only certain values to appear. But is it possible to only have lines that have a value within ± 50 of 350 (for example) to appear?
grep matches against string tokens, so you have to either:
figure out the right string match for the number range you want (e.g., for 300-499, you might do something like grep -E '[34]..', with appropriate additional context added to the expression and a number of additional .s equal to your floating-point precision), or
convert the number strings to actual numbers in whatever programming language you prefer to use and filter them that way
I'd strongly encourage you to take the second option.
I would go with awk here:
./yourProgram | awk '$1>250 && $1<350'
e.g.
echo -e "12.3\n342.678\n287.99999" | awk '$1>250 && $1<350'
342.678
287.99999
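For the ± 50 of 350 case from the question, the bounds can be computed from a centre and a tolerance instead of being typed in by hand. A minimal sketch, where c and r are variable names chosen here for illustration and the input values are made up:

```shell
# c is the centre and r the allowed distance; both names are illustrative.
kept=$(printf '%s\n' 12.3 342.678 287.99999 400 400.5 |
       awk -v c=350 -v r=50 '$1 >= c - r && $1 <= c + r')
printf '%s\n' "$kept"
```

Because awk compares the field numerically, 400 is kept (the bounds are inclusive here) while 287.99999 and 400.5 are dropped.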

GREP How do I search for words that contain specific letters (one or more times)?

I'm using the operating system's dictionary file as the word list to scan. I'm creating a Java program that lets a user enter any combination of letters and finds the words that contain those letters. How would I do this using grep commands?
To find words that contain only the given letters:
grep -v '[^aeiou]' wordlist
The above filters out every line in wordlist that contains any character other than those listed. It's sort of using a double negative to get what you want. Another way to do this would be:
grep -E '^[aeiou]+$' wordlist
which matches only lines that consist entirely of one or more of the selected letters. (The -E is needed because + is a literal character in grep's default basic syntax.)
To find words that contain all of the given letters is a bit more lengthy, because there may be other letters in between the ones we want:
cat wordlist | grep a | grep e | grep i | grep o | grep u
(Yes, there is a useless use of cat above, but the symmetry is better this way.)
You can use a single grep to solve the last problem in Greg's answer, provided your grep supports PCRE. (Based on this excellent answer, boiled down a bit)
grep -P "(?=.*a)(?=.*e)(?=.*i)(?=.*o)(?=.*u)" wordlist
Each positive lookahead asserts that its letter occurs somewhere in the line, so the pattern matches any line with an "a" anywhere, an "e" anywhere, and so on.
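Since the letters come from user input, the chain of greps would have to be built per query. One possible alternative, sketched here with awk rather than grep, takes the letters as a single argument. The function name contains_all is hypothetical, and the letters are assumed to be plain characters without backslashes.

```shell
# contains_all LETTERS: read words on stdin and print those that contain
# every character of LETTERS at least once (hypothetical helper name).
contains_all() {
    awk -v letters="$1" '{
        for (i = 1; i <= length(letters); i++)
            if (index($0, substr(letters, i, 1)) == 0) next  # letter missing
        print
    }'
}

printf '%s\n' education sequoia banana facetious | contains_all aeiou
```

Against the real dictionary this would be called as contains_all aeiou < wordlist, with the letter string supplied by the calling program.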
