How can I grep for the Trademark Symbol ™?

How can I generate a list of files that contain the trademark symbol (™, ALT+0153)?
I believe this may be possible by using Perl regular expressions.

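This works for me with GNU grep in a UTF-8 locale:
echo "How I can generate a list of files that contain the trademark symbol (™ ALT+0153)?" | \
grep -P "\x{2122}"
Output:
How I can generate a list of files that contain the trademark symbol (™ ALT+0153)?
So the general command is
grep -I -l -r -P "\x{2122}" *
ALT+0153 is decimal 153, i.e. byte 0x99 in Windows-1252. In Unicode the symbol is U+2122, which UTF-8 encodes as the bytes E2 84 A2. (Octal 153 would be hex 6B, which is just the letter k.)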
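Where grep lacks -P, or the locale is not UTF-8, searching for the raw bytes is an alternative. The sketch below assumes the files are UTF-8 encoded, so that ™ (U+2122) is stored as the bytes E2 84 A2; the scratch directory and file names are made up for illustration.

```shell
# Scratch directory: one file with the symbol, one without (hypothetical names).
tmpdir=$(mktemp -d)
printf 'Acme\342\204\242 widgets\n' > "$tmpdir/with_tm.txt"   # octal 342 204 242 = hex E2 84 A2
printf 'plain text\n' > "$tmpdir/without_tm.txt"

# Build the three-byte sequence with printf and search for it as a fixed string.
# LC_ALL=C makes grep compare bytes, so the result does not depend on the locale.
tm=$(printf '\342\204\242')
matches=$(cd "$tmpdir" && LC_ALL=C grep -rlF -- "$tm" .)
echo "$matches"

rm -rf "$tmpdir"
```

For files saved as Windows-1252 the byte to look for would be 0x99 (octal 231) instead.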

Related

Does Mercurial have a template to capture output of "hg grep"?

I was searching for a change that included "foreach" so I used this Mercurial command:
$ hg grep -r "user(mjh) & public() & date(-30)" --diff -i foreach
and it does return the hits where "foreach" was added and removed.
However, I'd like to know the actual commit hashes too. If I add a template:
$ hg grep ... -T '{date|shortdate}\n{node|short}\n{desc|firstline}\n\n'
then I get the commit hash and description as expected, but then I don't see the changed files listed.
Is there a template to capture the output of hg grep? The {files} template lists the files associated with a commit, but that's not the actual grep output. Is there an iterable template keyword available for the grep results?
Please re-read hg help grep -v carefully (the -v option is important) and note the following part, which was new and unexpected for me too:
The following keywords are supported in addition to the common template
keywords and functions. See also 'hg help templates'.
change   String. Character denoting insertion "+" or removal "-".
         Available if "--diff" is specified.
lineno   Integer. Line number of the match.
path     String. Repository-absolute path of the file.
texts    List of text chunks.
After that you'll be able to reproduce the default output of grep in your template (roughly, because some details will differ slightly):
>hg grep --diff -i -r 1166 to_try
>hg grep --diff -i -r 1166 -T "{path}:{rev}:{change}:{texts}\n" to_try
hggit/compat.py:1166:-: for args in parameters_to_try:
hggit/compat.py:1166:+: for (args, kwargs) in parameters_to_try:
and after replacing {rev} by {node|short}
>hg grep --diff -i -r 1166 -T "{path}:{node|short}:{change}:{texts}\n" to_try
hggit/compat.py:f6cef55e6aeb:-: for args in parameters_to_try:
hggit/compat.py:f6cef55e6aeb:+: for (args, kwargs) in parameters_to_try:

grep -v no longer excluding pattern after migration

One of our shared hosting sites got moved recently. New server is Red Hat 4.8.5-36. The other binaries' versions are grep (GNU grep) 2.20 and find (GNU findutils) 4.5.11
This cron job had previously functioned fine for at least 6 years and gave us a list of updated files which did not match logs, cache etc.
find /home/example/example.com/public_html/ -mmin -12 \
| grep -v 'error_log|logs|cache'
After the move the -v seems to be ineffectual and we get results like
/home/example/example.com/public_html/products/cache/ssu/pc/d/5/c
The change in results occurred immediately after the move. Does anyone have an idea why it is now broken? Additionally, how do I restore the filtered output?
If you'd like to exclude a group of words:
grep -v -e 'error_log' -e 'logs' -e 'cache' file
With awk you can do:
awk '!/error_log|logs|cache/' file
It will exclude all lines with these words.
grep -v 'error_log|logs|cache'
only excludes strings that contain literally error_log|logs|cache. To use alternation, use extended regular expressions:
grep -Ev 'error_log|logs|cache'
GNU grep supports alternation as an extension to Basic Regular Expressions, but | needs to be escaped, so this might work as well:
grep -v 'error_log\|logs\|cache'
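The difference between the two regex flavours can be checked without touching the real site. This is a small sketch; the paths under /site are invented for the demonstration.

```shell
# Hypothetical sample of what find might print, one path per line.
paths='/site/index.php
/site/error_log
/site/products/cache/ssu/pc/d/5/c
/site/logs/access.txt'

# Basic regex: | is an ordinary character, so no line matches and nothing is removed.
bre=$(printf '%s\n' "$paths" | grep -v 'error_log|logs|cache')

# Extended regex: | is alternation, so only index.php survives.
ere=$(printf '%s\n' "$paths" | grep -Ev 'error_log|logs|cache')

printf 'BRE kept:\n%s\n\n' "$bre"
printf 'ERE kept:\n%s\n' "$ere"
```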
However, grep isn't required in the first place; (GNU) find can do all the work. Use -path rather than -name, because -name tests only the last path component and would not catch a file below a cache directory:
find /home/example/example.com/public_html/ -mmin -12 \
-not \( -path '*error_log*' -or -path '*logs*' -or -path '*cache*' \)
or, POSIX compliant:
find /home/example/example.com/public_html/ -mmin -12 \
\! \( -path '*error_log*' -o -path '*logs*' -o -path '*cache*' \)
or, if your find supports -regex (both GNU and BSD find do):
find /home/example/example.com/public_html/ -mmin -12 \
-not -regex '.*\(error_log\|logs\|cache\).*'
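When moving the filter into find, it matters which test is used: -name looks only at the last path component, while -path (like the grep pipeline and -regex) looks at the whole path. A sketch on a throwaway tree, with invented file names and without the -mmin test:

```shell
# Build a small tree that mimics the layout from the question (names are made up).
root=$(mktemp -d)
mkdir -p "$root/products/cache/ssu" "$root/logs"
touch "$root/index.php" "$root/error_log" \
      "$root/products/cache/ssu/c" "$root/logs/access.txt"

# -name: only the file's own name is tested, so files below cache/ and logs/ slip through.
by_name=$(cd "$root" && find . -type f \
  ! \( -name '*error_log*' -o -name '*logs*' -o -name '*cache*' \) | sort)

# -path: the whole path is tested, which matches what the grep pipeline did.
by_path=$(cd "$root" && find . -type f \
  ! \( -path '*error_log*' -o -path '*logs*' -o -path '*cache*' \) | sort)

printf 'with -name:\n%s\n\n' "$by_name"
printf 'with -path:\n%s\n' "$by_path"

rm -rf "$root"
```

Running find from inside the directory keeps the temporary directory's own name out of the patterns.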

How do I 'grep -c' and avoid printing files with zero '0' count

The command 'grep -c blah *' lists all the files, like below.
% grep -c jill *
file1:1
file2:0
file3:0
file4:0
file5:0
file6:1
%
What I want is:
% grep -c jill * | grep -v ':0'
file1:1
file6:1
%
Instead of piping and grep'ing the output like above, is there a flag to suppress listing files with 0 counts?
SJ
How to grep nonzero counts:
grep -rIcH 'string' . | grep -v ':0$'
-r Recurse subdirectories.
-I Ignore binary files (thanks #tongpu, warlock).
-c Show count of matches. Annoyingly, includes 0-count files.
-H Show file name, even if only one file (thanks #CraigEstey).
'string' your string goes here.
. Start from the current directory.
| grep -v ':0$' Remove 0-count files. (thanks #LaurentiuRoescu)
(I realize the OP was excluding the pipe trick, but this is what works for me.)
Just use awk. e.g. with GNU awk for ENDFILE:
awk '/jill/{c++} ENDFILE{if (c) print FILENAME":"c; c=0}' *
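The pipe approach can be tried on throwaway files. The sketch below also shows why the filter is anchored with $: a file whose name happens to contain ":0" would otherwise be dropped. All file names are hypothetical.

```shell
dir=$(mktemp -d)
printf 'jack\njill\n' > "$dir/file1"
printf 'jack\n'       > "$dir/file2"
printf 'jill\n'       > "$dir/rel:0.txt"   # the name itself contains ":0"

# Anchored: only lines that END in ":0" (a zero count) are removed.
anchored=$(cd "$dir" && grep -c jill file1 file2 rel:0.txt | grep -v ':0$')

# Unanchored: rel:0.txt is removed too, although its count is 1.
unanchored=$(cd "$dir" && grep -c jill file1 file2 rel:0.txt | grep -v ':0')

printf 'anchored:\n%s\n\n' "$anchored"
printf 'unanchored:\n%s\n' "$unanchored"

rm -rf "$dir"
```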

Grep: Capture just number

I am trying to use grep to just capture a number in a string but I am having difficulty.
echo "There are <strong>54</strong> cities" | grep -o "([0-9]+)"
How am I supposed to just have it return "54"? I have tried the above grep command and it doesn't work.
echo "You have <strong>54</strong>" | grep -o '[0-9]' seems to sort of work but it prints
5
4
instead of 54
Don't parse HTML with regex; use a proper parser:
$ echo "There are <strong>54</strong> cities " |
xmllint --html --xpath '//strong/text()' -
OUTPUT:
54
See also "RegEx match open tags except XHTML self-contained tags".
You need to use the "E" option for extended regex support (or use egrep). On my Mac OSX:
$ echo "There are <strong>54</strong> cities" | grep -Eo "[0-9]+"
54
You also need to consider whether a line can contain more than one number. What should the behavior be then?
EDIT 1: since you have now specified the requirement to be a number between <strong> tags, I would recommend using sed. On my platform, grep does not have the "P" option for perl style regexes. On my other box, the version of grep specifies that this is an experimental feature so I would go with sed in this case.
$ echo "There are <strong>54</strong> 12 cities" | sed -rn 's/^.*<strong>\s*([0-9]+)\s*<\/strong>.*$/\1/p'
54
Here "r" is for extended regex.
EDIT 2: If you have the "PCRE" option in your version of grep, you could also utilize the following with positive lookbehinds and lookaheads.
$ echo "There are <strong>54 </strong> 12 cities" | grep -o -P "(?<=<strong>)\s*([0-9]+)\s*(?=<\/strong>)"
54
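For systems where neither grep -P nor sed -r is available, the same two ideas can be written with POSIX options only. This is a sketch; the sample sentence is made up and deliberately contains a second number.

```shell
html='There are <strong>54</strong> of 120 cities'

# Every run of digits on the line, one per output line.
all=$(printf '%s\n' "$html" | grep -Eo '[0-9]+')

# Only the number between <strong> and </strong>, using basic regex syntax:
# [[:space:]] stands in for \s and [0-9][0-9]* stands in for [0-9]+.
inside=$(printf '%s\n' "$html" |
  sed -n 's/.*<strong>[[:space:]]*\([0-9][0-9]*\)[[:space:]]*<\/strong>.*/\1/p')

printf 'all numbers:\n%s\n' "$all"
printf 'inside <strong>: %s\n' "$inside"
```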

grep: repetition-operator operand invalid

I have this regular expression (?<=heads\/)(.*?)(?=\n) and you can see it working here
http://regexr.com?347dm
I need this regex to work in the grep command but I'm getting this error.
$ grep -Eio '(?<=heads\/)(.*?)(?=\n)' text.txt
grep: repetition-operator operand invalid
It works great in ack but I don't have ack on the machine I need to run this on.
ack text.txt -o --match '(?<=heads\/)(.*?)(?=\n)'
text.txt
74f3649af36984e1b784e46502fe318e91d29570 HEAD
06d4463ab47a6246e6bd94dc3b9267d59fc16c2e refs/heads/ARC
0597e13c22b6397a1b260951f9d064f668b26f08 refs/heads/LocationAge
e7e1ed942d15efb387c878b9d0335b37560c8807 refs/heads/feature/311-312-breaking-banner-updates
d0b2632b465702d840a358d0b192198ae505011c refs/heads/gulf-news
509173eafc6792739787787de0d23b0c804d4593 refs/heads/jbb-new-applicationdidfinishlaunching
1e7b03ce75b1a7ba47ff4fb5128bc0bf43a7393b refs/heads/locationdebug
74f3649af36984e1b784e46502fe318e91d29570 refs/heads/master
5d2ede384325877c24db7ba1ba0338dc7b7f84fb refs/heads/mixed-media
3f3b6a81dd3baea8744aec6b95c2fe4aaeb20ea3 refs/heads/post-onezero
4198a43aab2dfe72d7ae9e9e53fbb401fc9dac1f refs/heads/whitelabel
76741013b3b2200de29f53800d51dfd6dc7bac5e refs/tags/r10
fc53b1a05dad3072614fb397a228819a67615b82 refs/tags/r10^{}
afdcfd970c9387f6fda0390ef781c2776aa666c3 refs/tags/r11
grep -E (POSIX extended regular expressions) does not support the lookbehind (?<=...), lazy *?, or lookahead (?=...) operators. See this table.
$ grep -Pio '(?<=heads\/)(.*?)(?=\n)' text.txt # P option instead of E
If you use GNU grep, you can use -P or --perl-regexp options.
In case you are using OS X, you need to install GNU grep.
$ brew install grep
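Homebrew installs GNU grep under the name ggrep, so on macOS you either have to prefix the command with a 'g'
$ ggrep -Pio '(?<=heads\/)(.*?)(?=\n)' text.txt # P option instead of E
or put Homebrew's gnubin directory for grep ahead of the system directories in your PATH, so that plain grep runs the GNU version.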
Try this
grep -Eoh 'heads/.*' text.txt | grep -Eoh '/.*' | grep -Eoh '[a-zA-Z].*'
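If neither grep -P nor ack is available, a single sed substitution can also strip everything up to refs/heads/. A sketch using a few lines from text.txt above, held in a variable for the demonstration:

```shell
# A few lines copied from the sample text.txt.
refs='74f3649af36984e1b784e46502fe318e91d29570 HEAD
06d4463ab47a6246e6bd94dc3b9267d59fc16c2e refs/heads/ARC
e7e1ed942d15efb387c878b9d0335b37560c8807 refs/heads/feature/311-312-breaking-banner-updates
76741013b3b2200de29f53800d51dfd6dc7bac5e refs/tags/r10'

# -n with the p flag prints only lines where the substitution happened,
# so HEAD and the tag lines are skipped. Using | as the delimiter avoids escaping /.
branches=$(printf '%s\n' "$refs" | sed -n 's|.*refs/heads/||p')
echo "$branches"
```

Against the real file the same command would be sed -n 's|.*refs/heads/||p' text.txt.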
