Getting numbers from a string with grep


I have another simple question.
I have a string containing a substring in the format xx:xx:xx, where each x is a digit. I want to extract that substring, including the ":" characters, so my output would be "xx:xx:xx".
I think it can be done with grep -Eo [0-9], but I'm not sure of the syntax. Any help?

echo "substring in the format 12:43:37 where the x's are numbers" |
grep -o '[0-9:]*'
Output:
12:43:37
If you have other numbers in the input string you can be more specific:
grep -o '[0-9]*:[0-9]*:[0-9]*'
or even stricter, requiring exactly two digits per field:
grep -o '[0-9][0-9]:[0-9][0-9]:[0-9][0-9]'
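To see how the loose and strict patterns differ, here is a minimal sketch. The input line is made up for illustration, and the -E form with {2} is simply an equivalent spelling of the last pattern above; it assumes a grep that supports -o (GNU and BSD grep do).

```shell
# Hypothetical input with extra numbers around the timestamp.
line='run 7 started at 12:43:37 on port 8080'

# Loose character class: picks up every run of digits and colons.
loose=$(printf '%s\n' "$line" | grep -o '[0-9:]*')

# Strict pattern in extended syntax: exactly two digits, three times.
strict=$(printf '%s\n' "$line" | grep -Eo '[0-9]{2}:[0-9]{2}:[0-9]{2}')

printf 'loose:\n%s\n' "$loose"
printf 'strict: %s\n' "$strict"
```

The loose pattern also returns the stray numbers, while the strict one returns only the timestamp.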

Related

How to make "grep" output complete word that includes the match?

I would like grep to print out all complete words that include the match.
Google did not help me. Here is what I tried:
cat file.txt
21676 Mm.24685 NM_009346 ENSMUSG00000055320
20349 Mm.134093 NM_011348 ENSMUSG00000063531
12456 Mm.134000 NM_011228 GM415666
grep -o "ENSMUS" file.txt
ENSMUS
ENSMUS
Desired output:
ENSMUSG00000055320
ENSMUSG00000063531
Thanks for your help!

You may use:
grep -wo "ENSMUS[^[:blank:]]*" file.txt
ENSMUSG00000055320
ENSMUSG00000063531
Here [^[:blank:]]* matches zero or more characters that are not spaces or tabs. -w ensures the match starts and ends on word boundaries.
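As a quick check, the command can be run against the sample data from the question. This sketch feeds the three lines through a pipe instead of file.txt; the variable names are only for illustration.

```shell
# The three sample lines from the question.
sample='21676 Mm.24685 NM_009346 ENSMUSG00000055320
20349 Mm.134093 NM_011348 ENSMUSG00000063531
12456 Mm.134000 NM_011228 GM415666'

# -o prints only the match; the bracket expression extends the match
# from ENSMUS up to the next space or tab (or the end of the line).
ids=$(printf '%s\n' "$sample" | grep -wo 'ENSMUS[^[:blank:]]*')

printf '%s\n' "$ids"
```

The third line has no ENSMUS word, so it contributes nothing to the output.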

To extract ENSEMBL mouse accession numbers without the version number:
grep -Po 'ENSMUS\w+' in_file
With the version number:
grep -Po 'ENSMUS\S+' in_file
Here,
\w+ : 1 or more word characters ([A-Za-z0-9_]).
\S+ : 1 or more non-whitespace characters (you can also be more restrictive and use [\w.]+, which is 1 or more word character or literal dot).
Here, GNU grep uses the following options:
-P : Use Perl regexes.
-o : Print only the matched parts, each on its own output line, not the entire lines.
SEE ALSO:
grep manual
perlre - Perl regular expressions
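If -P is not available, roughly the same distinction can be sketched with POSIX bracket expressions and -E. The versioned ID below (with the .7 suffix) is a made-up example, not taken from the question.

```shell
# Hypothetical line containing an accession with a version suffix.
line='gene ENSMUSG00000055320.7 chr1'

# Word characters only (like \w+): stops at the dot.
no_version=$(printf '%s\n' "$line" | grep -Eo 'ENSMUS[[:alnum:]_]+')

# Non-whitespace characters (like \S+): keeps the version.
with_version=$(printf '%s\n' "$line" | grep -Eo 'ENSMUS[^[:space:]]+')

printf '%s\n%s\n' "$no_version" "$with_version"
```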

How to find a string with specified length in a text file using findstr or grep command?

I want to find a string of length 8 which starts with the characters "alo", in a text file.
For findstr, I have tried the following command - findstr /R "\<alo" file.txt. This command searches for strings starting with "alo" but cannot search for strings of length 8. For grep, I don't know how to do it.

grep -woE 'alo.{5}' filename
-o is for printing only the match
-E is to use extended regex
-w will make the given expression match only whole words
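The number inside the braces specifies how many characters to match after the letters 'alo'. Note that . matches any character, including a space; use a class such as [[:alnum:]] instead if the match must stay within a single word.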
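A small sketch of the whole-word approach, using a character class instead of the dot so the match cannot span several words. The sample text is invented for illustration.

```shell
# Hypothetical text: only "alopecia" is a word of exactly 8 characters
# that starts with "alo".
text='alopecia alongside aloha alo ab c done'

# alo + exactly 5 word characters, matched as a whole word.
word=$(printf '%s\n' "$text" | grep -woE 'alo[[:alnum:]_]{5}')

printf '%s\n' "$word"
```

Here "alongside" is rejected by -w because the 8-character candidate "alongsid" is followed by another letter.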

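grep -E '^alo.{5}$' filename
This matches only lines that consist of exactly 8 characters starting with 'alo'. (Writing (.*){5} instead of .{5} would match a line of any length.)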

Cutting a length of specific string with grep

Let's say we have a string "test123" in a text file.
How do we cut out "test12" only or let's say there is other garbage behind "test123" such as test123x19853 and we want to cut out "test123x"?
I tried with grep -a "test123.\{1,4\}" testasd.txt and so on, but just can't get it right.
I also looked for examples, but never found what I was looking for.

expr:
kent$ x="test123x19853"
kent$ echo $(expr "$x" : '\(test.\{1,4\}\)')
test123x

What you need is -o, which prints only the matched part:
$ echo "test123x19853"|grep -o "test.\{1,4\}"
test123x
$ echo "test123x19853"|grep -oP "test.{1,4}"
test123x
-o, --only-matching show only the part of a line matching PATTERN
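To illustrate how the interval behaves, here is a minimal sketch with both strings from the question. The \{2\} variant for getting "test12" is an assumption about what the asker wanted, not something the answers state.

```shell
# \{1,4\} is greedy: it takes up to 4 characters, fewer if the line ends.
short=$(printf '%s\n' 'test123' | grep -o 'test.\{1,4\}')
long=$(printf '%s\n' 'test123x19853' | grep -o 'test.\{1,4\}')

# An exact count cuts at a fixed length: "test" plus 2 characters.
exact=$(printf '%s\n' 'test123x19853' | grep -o 'test.\{2\}')

printf '%s\n%s\n%s\n' "$short" "$long" "$exact"
```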

If you are OK with awk, then try the following (note that this looks for a run of letters followed by a run of digits; it does not limit the digits to 4 or 5).
echo "test123x19853" | awk 'match($0,/[a-zA-Z]+[0-9]+/){print substr($0,RSTART,RLENGTH)}'
In case you want only 1 to 4 digits after the first run of letters, try the following (my awk is an old version, so I am using --re-interval; you can remove it if you have a recent version).
echo "test123x19853" | awk --re-interval 'match($0,/[a-zA-Z]+[0-9]{1,4}/){print substr($0,RSTART,RLENGTH)}'
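For readers unfamiliar with match(), this sketch prints the RSTART and RLENGTH values it sets alongside the extracted substring. The id= prefix on the input is made up to show a match that does not start at column 1; the pattern avoids intervals so it should not need --re-interval.

```shell
# match() sets RSTART (1-based start) and RLENGTH (length of the match).
result=$(printf '%s\n' 'id=test123x19853' |
  awk 'match($0, /[a-zA-Z]+[0-9]+/) { print RSTART, RLENGTH, substr($0, RSTART, RLENGTH) }')

printf '%s\n' "$result"
```

The leading "id" is skipped because it is not directly followed by a digit, so the match begins at the "t" in position 4.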

grep Everything after a string

I need to grep only the MD5 hash.
This is the line containing the hash:
MD5 (mt.pm) = adcddd9492c707642d2bcffbfc67b7a6
It needs to look like this:
adcddd9492c707642d2bcffbfc67b7a6
Or, to do the reverse, take
crapb0c63a3cb776502fe03706b2fd540439 /home/mta.pm"
and only get the hash.
I have no clue how to do this.
Any help?

To grep, do the following (this will not work in all grep implementations):
grep -o '[a-z0-9]*$'
or you can use sed:
sed 's/.*= *\([a-z0-9]*\)$/\1/'
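One thing to be aware of: as written, sed prints lines that do not match unchanged. A possible variation, sketched here with a made-up second input line, is to add -n and the p flag so only substituted lines are printed.

```shell
# First line is from the question; the second is a hypothetical non-matching line.
input='MD5 (mt.pm) = adcddd9492c707642d2bcffbfc67b7a6
this line has no hash'

# -n suppresses default output; the trailing p prints only lines
# where the substitution succeeded.
hash=$(printf '%s\n' "$input" | sed -n 's/.*= *\([a-z0-9]*\)$/\1/p')

printf '%s\n' "$hash"
```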

Try this (GNU grep):
grep -oP '.* \K.*$'
Or better:
grep -o '[[:xdigit:]]\{32\}$'
Or with bash:
read -a arr <<< 'MD5 (mt.pm) = adcddd9492c707642d2bcffbfc67b7a6'
echo "${arr[-1]}"
The \{32\} quantifier makes the pattern much stricter: an MD5 hash is always 32 hexadecimal characters, see http://en.wikipedia.org/wiki/MD5
[[:xdigit:]] is a POSIX character class that matches only hexadecimal digits.
Finally, if you want to match 32 hexadecimal characters anywhere in a string:
grep -o '[[:xdigit:]]\{32\}'
will do the trick.
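The unanchored pattern handles both layouts from the question, as this short sketch suggests. The two input lines are the ones quoted in the question (without the stray trailing quote); the variable names are only for illustration.

```shell
# Hash at the end of the line, and hash in the middle after a junk prefix.
lines='MD5 (mt.pm) = adcddd9492c707642d2bcffbfc67b7a6
crapb0c63a3cb776502fe03706b2fd540439 /home/mta.pm'

# Any run of exactly 32 hex digits, wherever it sits in the line.
hashes=$(printf '%s\n' "$lines" | grep -o '[[:xdigit:]]\{32\}')

printf '%s\n' "$hashes"
```

Because the pattern is not anchored, a longer hex string (a SHA-1 digest, say) would yield its first 32 characters, so this is best used on input known to contain MD5 sums.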

Use grep to report back only line numbers

I have a file that possibly contains bad formatting (in this case, the occurrence of the pattern \\backslash). I would like to use grep to return only the line numbers where this occurs (as in, the match was here, go to line # x and fix it).
However, there doesn't seem to be a way to print the line number (grep -n) and not the match or line itself.
I can use another regex to extract the line numbers, but I want to make sure grep cannot do it by itself. grep -no comes closest, I think, but still displays the match.

Try:
grep -n "text to find" file.ext | cut -f1 -d:

If you're open to using AWK:
awk '/textstring/ {print FNR}' textfile
In this case, FNR is the line number. AWK is a great tool when you're looking at grep|cut, or any time you're looking to take grep output and manipulate it.

All of these answers require grep to generate the entire matching lines, then pipe it to another program. If your lines are very long, it might be more efficient to use just sed to output the line numbers:
sed -n '/pattern/=' filename
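The grep|cut, awk and sed approaches should agree with each other. Here is a minimal sketch that checks this on a throwaway file; the file contents and the pattern "bad" are invented, and mktemp is assumed to be available.

```shell
# Build a small temporary file; lines 2 and 4 contain the pattern.
tmp=$(mktemp)
printf '%s\n' 'ok' 'bad value' 'ok: fine' 'bad: again' > "$tmp"

with_cut=$(grep -n 'bad' "$tmp" | cut -d: -f1)   # grep prints N:line, keep N
with_awk=$(awk '/bad/ { print FNR }' "$tmp")     # FNR is the current line number
with_sed=$(sed -n '/bad/=' "$tmp")               # = prints the line number

rm -f "$tmp"
printf '%s\n' "$with_sed"
```

Note that a colon inside the matched line (as on line 4) does not disturb cut -f1, since the line number is always the first field.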

Bash version (this keeps only the first matching line number):
lineno=$(grep -n "pattern" filename)
lineno=${lineno%%:*}

I recommend the answers with sed and awk for just getting the line number, rather than using grep to get the entire matching line and then removing that from the output with cut or another tool. For completeness, you can also use Perl:
perl -nE 'say $. if /pattern/' filename
or Ruby:
ruby -ne 'puts $. if /pattern/' filename

Using only grep (GNU grep, since -P is required):
grep -n "text to find" file.ext | grep -Po '^[^:]+'

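If grep prefixes each line with the file name (for example when searching several files, or with -H), the line number is the second colon-separated field, not the first:
grep -Hn "text to find" file.txt | cut -f2 -d: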
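Which field holds the line number depends on whether grep prints a file name prefix. This sketch compares a single-file search with a two-file search; the file names and contents are hypothetical, and it assumes the temporary directory path contains no colon.

```shell
# Two throwaway files in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' 'ok' 'bad' > "$dir/a.txt"
printf '%s\n' 'bad' 'ok' 'bad' > "$dir/b.txt"

# One file: output is N:line, so the number is field 1.
single=$(grep -n 'bad' "$dir/a.txt" | cut -d: -f1)

# Several files: output is file:N:line, so the number is field 2.
multi=$(grep -n 'bad' "$dir/a.txt" "$dir/b.txt" | cut -d: -f2)

rm -rf "$dir"
printf 'single: %s\nmulti:\n%s\n' "$single" "$multi"
```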

To count the number of lines that match the pattern:
grep -n "pattern" file.ext | wc -l
To print the lines that match the pattern:
sed -n '/pattern/p' file.ext
To display the line numbers on which the pattern was matched:
grep -n "pattern" file.ext | cut -f1 -d:
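For counting, grep's own -c option is a possible shortcut for the wc -l pipeline. A small sketch on an invented file, showing that both count matching lines rather than individual occurrences:

```shell
# Line 2 contains the pattern twice, line 3 once.
tmp=$(mktemp)
printf '%s\n' 'ok' 'bad bad' 'bad' > "$tmp"

piped=$(grep -n 'bad' "$tmp" | wc -l | tr -d ' ')  # tr strips padding some wc versions add
direct=$(grep -c 'bad' "$tmp")                      # -c counts matching lines

rm -f "$tmp"
printf '%s %s\n' "$piped" "$direct"
```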
