I came with another simple question...
I have a string containing a substring in the format xx:xx:xx, where the x's are digits. I want to extract that substring, including the ":" symbols, so my output would be "xx:xx:xx".
I think it can be done with grep -Eo [0-9], but I'm not sure of the syntax... Any help?
echo "substring in the format 12:43:37 where the x's are numbers" |
grep -o '[0-9:]*'
Output:
12:43:37
If you have other numbers in the input string you can be more specific:
grep -o '[0-9]*:[0-9]*:[0-9]*'
or even:
grep -o '[0-9][0-9]:[0-9][0-9]:[0-9][0-9]'
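As a rough sketch of the stricter pattern, the same match can be written with an extended-regex interval ({2}) instead of repeating [0-9]. The sample string is made up for illustration, and -o is assumed to be available (GNU and BSD grep have it, but POSIX does not require it):

```shell
# Hypothetical input that contains other numbers besides the time.
line="job 42 started at 12:43:37 and took 15 minutes"

# -E enables extended regexes, so {2} means "exactly two of the preceding item".
# -o prints only the matched part instead of the whole line.
time_part=$(printf '%s\n' "$line" | grep -Eo '[0-9]{2}:[0-9]{2}:[0-9]{2}')

printf '%s\n' "$time_part"
```

The looser '[0-9:]*' also accepts plain runs of digits such as 42 and 15, which is why the stricter form is worth the extra typing when the input contains other numbers.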
I would like grep to print out all complete words that include the match.
Google did not help me. Here is what I tried:
cat file.txt
21676 Mm.24685 NM_009346 ENSMUSG00000055320
20349 Mm.134093 NM_011348 ENSMUSG00000063531
12456 Mm.134000 NM_011228 GM415666
grep -o "ENSMUS" file.txt
ENSMUS
ENSMUS
Desired output:
ENSMUSG00000055320
ENSMUSG00000063531
Thanks for your help!
You may use:
grep -wo "ENSMUS[^[:blank:]]*" file.txt
ENSMUSG00000055320
ENSMUSG00000063531
Here [^[:blank:]]* matches 0 or more characters that are not blanks (spaces or tabs). -w ensures that only whole-word matches are reported.
To extract ENSEMBL mouse accession numbers without the version number:
grep -Po 'ENSMUS\w+' in_file
With the version number:
grep -Po 'ENSMUS\S+' in_file
Here,
\w+ : 1 or more word characters ([A-Za-z0-9_]).
\S+ : 1 or more non-whitespace characters (you can also be more restrictive and use [\w.]+, which is 1 or more word character or literal dot).
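For systems whose grep lacks -P, a minimal sketch of the same two variants using POSIX character classes with -E: [[:alnum:]_] stands in for \w and [^[:space:]] for \S. The versioned ID below (the ".5" suffix) is invented for the demonstration.

```shell
# Hypothetical line with a versioned ID (the ".5" suffix is made up).
line="hit: ENSMUSG00000055320.5 score=12"

# Word characters only: the match stops at the dot, so the version is dropped.
no_version=$(printf '%s\n' "$line" | grep -Eo 'ENSMUS[[:alnum:]_]+')

# Any non-whitespace characters: the dot and the version are kept.
with_version=$(printf '%s\n' "$line" | grep -Eo 'ENSMUS[^[:space:]]+')

printf '%s\n%s\n' "$no_version" "$with_version"
```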
Here, GNU grep uses the following options:
-P : Use Perl regexes.
-o : Print only the matched parts (each match on its own output line), not the entire lines.
SEE ALSO:
grep manual
perlre - Perl regular expressions
I want to find a string of length 8 which starts with the characters "alo", in a text file.
For findstr, I have tried the following command - findstr /R "\<alo" file.txt. This command searches for strings starting with "alo" but cannot search for strings of length 8. For grep, I don't know how to do it.
grep -woE 'alo.{5}' filename
-o is for printing only the match
-E is to use extended regex
-w will make the given expression match only whole words
The number inside the braces specifies how many characters to match after the letters 'alo'
To match whole lines that are exactly 8 characters long and start with "alo":
grep -E '^alo.{5}$' filename
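A small sketch of the whole-word variant on made-up sample words. It assumes a "string" here means a run of word characters, so it uses [[:alnum:]_] rather than the dot (which would also accept spaces and punctuation):

```shell
# Hypothetical word list, one word per line.
words=$(printf '%s\n' alohabcd aloha alo12345 alo123456 xalo12345 |
  grep -woE 'alo[[:alnum:]_]{5}')

# Only the words that are exactly 8 characters long and start with "alo" remain.
printf '%s\n' "$words"
```

The words that are too short, too long, or do not start with "alo" are rejected because -w requires the match to begin and end at a word boundary.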
Let's say we have a string "test123" in a text file.
How do we cut out only "test12"? Or, say there is other garbage behind "test123", such as test123x19853, and we want to cut out "test123x"?
I tried grep -a "test123.\{1,4\}" testasd.txt and so on, but just can't get it right.
I also looked for examples, but never found what I'm looking for.
expr:
kent$ x="test123x19853"
kent$ echo $(expr "$x" : '\(test.\{1,4\}\)')
test123x
What you need is -o, which prints only the matched part:
$ echo "test123x19853"|grep -o "test.\{1,4\}"
test123x
$ echo "test123x19853"|grep -oP "test.{1,4}"
test123x
-o, --only-matching show only the part of a line matching PATTERN
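To make the effect of the interval concrete, here is a short sketch covering both outputs asked about in the question. It assumes a grep that supports -o and -E (GNU or BSD grep); the variable names are only illustrative.

```shell
s="test123x19853"

# .{2} takes exactly two characters after "test".
short=$(printf '%s\n' "$s" | grep -oE 'test.{2}')

# .{1,4} is greedy: it takes as many characters as allowed, here all four.
long=$(printf '%s\n' "$s" | grep -oE 'test.{1,4}')

printf '%s\n%s\n' "$short" "$long"
```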
If you are OK with awk, then try the following (note that this looks for a run of letters followed by a run of digits; it does not limit the digits to 4 or 5).
echo "test123x19853" | awk 'match($0,/[a-zA-Z]+[0-9]+/){print substr($0,RSTART,RLENGTH)}'
In case you want only 1 to 4 digits after the first run of letters, try the following (my awk is an old version, so I am using --re-interval; you can remove it if you have a recent version).
echo "test123x19853" | awk --re-interval 'match($0,/[a-zA-Z]+[0-9]{1,4}/){print substr($0,RSTART,RLENGTH)}'
I need to only grep the md5 hash
this is the hash
MD5 (mt.pm) = adcddd9492c707642d2bcffbfc67b7a6
it needs to look like this
adcddd9492c707642d2bcffbfc67b7a6
or to do the reverse
crapb0c63a3cb776502fe03706b2fd540439 /home/mta.pm"
and only get the hash
no clue how to
Any help?
To grep, do the following (this will not work in all grep implementations):
grep -o '[a-z0-9]*$'
or you can use sed:
sed 's/.*= *\([a-z0-9]*\)$/\1/'
Try this (GNU grep):
grep -oP '.* \K.*$'
Or better :
grep -o '[[:xdigit:]]\{32\}$'
Or with bash :
read -r -a arr <<< 'MD5 (mt.pm) = adcddd9492c707642d2bcffbfc67b7a6'
echo "${arr[-1]}"
With \{32\} the match is much stricter: an MD5 hash is always 32 hexadecimal characters, see http://en.wikipedia.org/wiki/MD5
[[:xdigit:]] is a POSIX character class that matches only hexadecimal digits.
FINALLY
If you want to match a run of 32 hex characters anywhere in a string:
grep -o '[[:xdigit:]]\{32\}'
will do the trick.
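A quick sketch that runs the unanchored pattern over both input shapes from the question (the second line is used without its trailing quote). It uses -E so the interval can be written as {32}; -o is assumed to be supported by the local grep.

```shell
# The two input shapes from the question.
hashes=$(printf '%s\n' \
  'MD5 (mt.pm) = adcddd9492c707642d2bcffbfc67b7a6' \
  'crapb0c63a3cb776502fe03706b2fd540439 /home/mta.pm' |
  grep -oE '[[:xdigit:]]{32}')

printf '%s\n' "$hashes"
```

One caveat: a longer hex run, such as a 40-character SHA-1 digest, also contains a 32-character match, so this pattern alone does not prove the value is an MD5 hash.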
I have a file that possibly contains bad formatting (in this case, the occurrence of the pattern \\backslash). I would like to use grep to return only the line numbers where this occurs (as in, the match was here, go to line # x and fix it).
However, there doesn't seem to be a way to print the line number (grep -n) and not the match or line itself.
I can use another regex to extract the line numbers, but I want to make sure grep cannot do it by itself. grep -no comes closest, I think, but still displays the match.
try:
grep -n "text to find" file.ext | cut -f1 -d:
If you're open to using AWK:
awk '/textstring/ {print FNR}' textfile
In this case, FNR is the line number. AWK is a great tool when you're looking at grep|cut, or any time you're looking to take grep output and manipulate it.
All of these answers require grep to generate the entire matching lines, then pipe it to another program. If your lines are very long, it might be more efficient to use just sed to output the line numbers:
sed -n '/pattern/=' filename
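As a sanity check, the three approaches (grep piped to cut, awk, and sed) can be compared on a small throwaway file. This is only a sketch: the file contents and the TODO pattern are placeholders, not the pattern from the question.

```shell
# Hypothetical file; "TODO" stands in for the pattern being hunted.
tmp=$(mktemp)
printf '%s\n' 'first line' 'TODO fix this' 'third line' 'another TODO' > "$tmp"

with_cut=$(grep -n 'TODO' "$tmp" | cut -d: -f1)   # grep emits "N:line", cut keeps N
with_awk=$(awk '/TODO/ {print FNR}' "$tmp")       # FNR is the per-file line number
with_sed=$(sed -n '/TODO/=' "$tmp")               # "=" prints the current line number

printf '%s\n' "$with_sed"
rm -f "$tmp"
```

All three variables end up holding the same list of line numbers, one per line.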
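Bash version (this keeps only the first matching line number, because %%:* strips everything from the first colon onward):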
lineno=$(grep -n "pattern" filename)
lineno=${lineno%%:*}
I recommend the answers with sed and awk for just getting the line number, rather than using grep to get the entire matching line and then removing that from the output with cut or another tool. For completeness, you can also use Perl:
perl -nE 'say $. if /pattern/' filename
or Ruby:
ruby -ne 'puts $. if /pattern/' filename
Using only grep (GNU grep, because of -P):
grep -n "text to find" file.ext | grep -Po '^[^:]+'
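With a single input file, grep -n prints lineno:line, so the line number is the first field. The second field is the one you want only when grep also prefixes the file name (several input files, or -H):
grep -Hn "text to find" file.txt | cut -f2 -d: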
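Which field holds the line number depends on whether grep prefixes each match with the file name. A minimal sketch, assuming a grep that supports -H (GNU and BSD grep do) and a temporary file whose path contains no colon:

```shell
tmp=$(mktemp)
printf '%s\n' 'first line' 'text to find' > "$tmp"

# One file, no -H: output is "2:text to find", so the number is field 1.
single=$(grep -n 'text to find' "$tmp" | cut -d: -f1)

# With -H the output is "<file>:2:text to find", so the number is field 2.
prefixed=$(grep -Hn 'text to find' "$tmp" | cut -d: -f2)

printf '%s %s\n' "$single" "$prefixed"
rm -f "$tmp"
```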
To count the number of lines that match the pattern:
grep -c "Pattern" in_file.ext
To print the lines that match the pattern:
sed -n '/pattern/p' file.ext
To display the line numbers on which the pattern was matched:
grep -n "pattern" file.ext | cut -f1 -d: