Illegal variable name error when using grep -v '^$' [duplicate] - grep

This question already has an answer here:
why is a double-quoted awk command substitution failing in csh
(1 answer)
Closed 4 years ago.
I get an error Illegal variable name when I use this line of code:
set users = "` last | sort | tr -s '\t' ' ' | grep '[0,2][0-4]:[0-5][0-9] -' | grep -v '^$' | grep -v '[2][0-1]:[0-5][0-9] -' `"
But it works fine when I use this code:
set users = "` last | sort | tr -s '\t' ' ' | grep '[0,2][0-4]:[0-5][0-9] -' | grep -v '[2][0-1]:[0-5][0-9] -' `"
The code should store people who logged in between 22:00 and 05:00 (excluding 05:00) into a variable named users. It should also remove any empty lines which are in the output. This is what I'm trying to do in the first code, but it gives me the aforementioned error.

I don't know how to explain it, but this is one of those typical CSH pitfalls.
A <dollar> ($) between <double-quotes> (") is always considered the start of a variable name, regardless of whether it also sits between <back-ticks> (`) or <single-quotes> ('). So if the word following the <dollar> is not a valid variable name, csh starts to complain. Example:
$ grep "foo$" file.txt
Illegal variable name.
This is exactly what your problem is. You wrote something similar to
$ set var = "`grep -v '^$' file.txt`"
and even though the <dollar> is between <single-quotes> which are in-between <back-ticks> for command substitution which is again between <double-quotes> to retain the blanks and tabs of the command substitution, it just does not matter! There is no hope! You used <double-quotes> with all good intentions, but it just blew up in your face! Resistance is futile, your <dollar> will be assimilated to resemble a variable, even when it does not! CSH just does not care! You just want to cry! You cannot even escape it!
If you make use of last from util-linux, you might be interested in the flags --since and --until (see here and here). Otherwise you might use the following command line:
set users="`last | awk '/(2[2-3]|0[0-4]):.. [-s]/'`"
This will match all lines where the user logged in between 22:00 and 05:00 (exclusive), including users who are still logged in.
As a general note, I would suggest switching from CSH to BASH for many reasons. Some of them are mentioned here and here.
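For comparison, here is a rough sketch (not part of the original answer) of the same filter in bash, where single quotes inside $(...) protect the $ and grep -v '^$' behaves as intended. The sample text and the helper name night_logins are made up for illustration; real last output varies by system and usually needs its whitespace squeezed first.

```shell
# Made-up stand-in for `last` output, fields separated by single spaces.
sample='alice pts/0 10.0.0.1 Mon Jan 1 22:15 - 23:40 (01:25)

bob pts/1 10.0.0.2 Mon Jan 1 05:00 - 06:10 (01:10)
carol pts/2 10.0.0.3 Tue Jan 2 03:59 still logged in'

# Hypothetical helper: drop empty lines, then keep logins whose start time
# is between 22:00 and 04:59 (the awk pattern is the one from the answer).
night_logins() {
  grep -v '^$' | awk '/(2[2-3]|0[0-4]):.. [-s]/'
}

# In bash the '^$' inside the command substitution is passed to grep untouched.
users="$(printf '%s\n' "$sample" | night_logins)"
printf '%s\n' "$users"
```

With this sample, the lines for alice and carol are kept, and the empty line and bob's 05:00 login are dropped.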

Related

How can i make grep show a line ignoring the words i want?

I am trying to use grep with the pwd command.
So, if i enter pwd, it shows me something like:
/home/hrq/my-project/
But, for the purposes of a script I am making, I need to use it with grep so that it only prints what comes after hrq/. That is, I always need to hide my home folder (the /home/hrq/ part) and show only what follows (in this case, only my-project).
Is it possible?
I tried something like
pwd | grep -ov 'home', since I saw that the "-v" flag is equivalent to the NOT operator, and I combined it with the "-o" (only-matching) flag. But it didn't work.
Given:
$ pwd
/home/foo/tmp
$ echo "$PWD"
/home/foo/tmp
Depending on what it is you really want to do, either of these is probably what you really should be using rather than trying to use grep:
$ basename "$PWD"
tmp
$ echo "${PWD#/home/foo/}"
tmp
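The parameter expansion above can be wrapped in a small function so that the prefix is not hard-coded. This is only a sketch; the name strip_prefix and its two arguments are assumptions made for the example.

```shell
# Hypothetical helper: print the path in $1 with the directory prefix $2 removed.
strip_prefix() {
  local path="${1%/}"              # drop one trailing slash, if present
  printf '%s\n' "${path#"$2"/}"    # remove the leading "$2/" when it matches
}

strip_prefix /home/hrq/my-project/ /home/hrq
```

In a script, strip_prefix "$PWD" "$HOME" would do the same for the current directory. If the path does not start with the prefix, it is printed unchanged.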
Use grep -Po 'hrq/\K.*', for example:
grep -Po 'hrq/\K.*' <<< '/home/hrq/my-project/'
my-project/
Here, the command uses the following options, plus the \K regex construct:
-P : Use Perl regexes.
-o : Print the matches only (1 match per line), not the entire lines.
\K : Cause the regex engine to "keep" everything it had matched prior to the \K and not include it in the match. Specifically, ignore the preceding part of the regex when printing the match.
SEE ALSO:
grep manual
perlre - Perl regular expressions
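If your grep was built without -P support, a similar result can be had with sed. This is a sketch rather than an exact equivalent: the helper name after_marker is made up, and because .* is greedy it cuts after the last occurrence of the marker, whereas the \K version starts from the first.

```shell
# Hypothetical helper: print what follows the marker in $1 on each input line.
# Assumes the marker contains no '|' and no regex metacharacters.
after_marker() {
  sed -n "s|^.*$1||p"   # delete everything up to and including the marker
}

printf '%s\n' '/home/hrq/my-project/' | after_marker 'hrq/'
```

Lines that do not contain the marker are not printed at all, which mirrors what grep -o does.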

grep exact match in colon delimited string

I am trying to extract the version from a colon delimited list. The value I want is for foo, however there is another value in the list called foo-bar causing both values to return. This is what I am doing:
LIST="foo:1.0.0
foo-bar:1.0.1"
VERSION=$(echo "${LIST}" | grep "\bfoo\b" | cut -s -d':' -f2)
echo -e "VERSION: ${VERSION}"
Output:
VERSION: 1.0.0
1.0.1
NOTE: Sometimes LIST will look like the following, which should result in version being empty (this is expected).
LIST="foo
foo-bar:1.0.1"
You may use a PCRE regex enabled with -P option and use a (?!-) negative lookahead that will fail the match in case there is a - after a whole word foo:
grep -P "\bfoo\b(?!-)"
See online demo
This regex should extract any number and optional dots at the end of each line. If the line ends with a colon, then it won't match.
grep -oE '(([[:digit:]]+[.]*)+)$'
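Another option, not taken from the answers above, is to skip the regex and compare the first colon-separated field exactly with awk. A minimal sketch, where the function name version_of is made up for the example:

```shell
# Hypothetical helper: print the version for the exact name in $1,
# reading the colon-delimited list passed in $2.
version_of() {
  printf '%s\n' "$2" | awk -F: -v name="$1" '$1 == name { print $2 }'
}

LIST="foo:1.0.0
foo-bar:1.0.1"
VERSION=$(version_of foo "$LIST")
echo "VERSION: ${VERSION}"
```

Because the comparison is on the whole field, foo-bar never matches foo, and a bare foo line without a colon yields an empty VERSION, as the question expects.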

grep "?" does not match valid matches

I want to match tags in files (with optional brackets) ... easy one would think ... the regex is something like ^\[?MyTag\]?. But ... Grep doesn't like it. None of the lines that would be valid matches are actually matched.
The interesting part is: if I replace the ? with a * (so zero to infinite matches, not zero or one) it matches everything like it should, but really that would mean the feature is broken and I don't believe that.
Any input?
Using grep (GNU grep) 2.22 on Windows.
PS: so grep is like this ...
grep -e "^\[?MyTag\]?" file.txt
and my test file is like this
[MyTag] hello
NotMyTag ugly
[NotMyTag] dumb
MyTag world
which obviously should result in 1st and 4th line showing but shows nothing.
First off, ? is not supported in vanilla grep, so you need to use the -E flag to enable extended regex. You can easily verify this by running grep '?' <<< 'a' and grep -E '?' <<< 'a'. Only the latter will match. -e just explicitly indicates what your regex is. It is not the same as -E.
Your initial command works fine if you change the -e to upper case:
grep -E '^\[?MyTag\]?'
Example:
$ grep -E '^\[?MyTag\]?' <<< '[MyTag] hello
> NotMyTag ugly
> [NotMyTag] dumb
> MyTag world'
Output:
[MyTag] hello
MyTag world
Credit goes to the answers of this question on SuperUser.
? is not part of the basic regular expressions, which grep uses by default. GNU grep supports it as an extension, but you have to escape it:
$ grep '^\[\?MyTag\]\?' file.txt
[MyTag] hello
MyTag world
Or, as pointed out, use grep -E to enable extended regular expressions.
For GNU grep, the only difference between grep and grep -E, i.e., using basic and extended regular expressions, is what you have to escape and what not.
Basic regular expressions
Capture groups and quantifying have to be escaped: \( \) and \{ \}
Zero or one (?), one or more (+) and alternation (|) are not part of BRE, but supported by GNU grep as an extension (but need to be escaped: \? \+ \|)
Extended regular expressions
Capture groups and quantifying don't have to be escaped: ( ) and { }
?, + and | are supported and don't need to be escaped
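The difference can be checked side by side. The sketch below assumes GNU grep (the escaped \? form is a GNU extension) and leaves the closing bracket unescaped, since ] is not special outside a bracket expression.

```shell
tags='[MyTag] hello
NotMyTag ugly
[NotMyTag] dumb
MyTag world'

# Basic regex with GNU's escaped \? quantifier.
bre=$(printf '%s\n' "$tags" | grep '^\[\?MyTag]\?')

# Extended regex: ? is a quantifier without any escaping.
ere=$(printf '%s\n' "$tags" | grep -E '^\[?MyTag]?')

# Basic regex with a bare ?: it is a literal question mark, so nothing matches.
# The "|| true" keeps grep's exit status 1 from aborting a script run with set -e.
literal=$(printf '%s\n' "$tags" | grep '^\[?MyTag]?' || true)

printf 'BRE:\n%s\nERE:\n%s\nbare ?:\n%s\n' "$bre" "$ere" "$literal"
```

The first two commands print the same two lines, and the third prints nothing.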

How to truncate long matching lines returned by grep or ack

I want to run ack or grep on HTML files that often have very long lines. I don't want to see very long lines that wrap repeatedly. But I do want to see just that portion of a long line that surrounds a string that matches the regular expression. How can I get this using any combination of Unix tools?
You could use the grep options -oE, possibly in combination with changing your pattern to ".{0,10}<original pattern>.{0,10}" in order to see some context around it:
-o, --only-matching
Show only the part of a matching line that matches PATTERN.
-E, --extended-regexp
Interpret pattern as an extended regular expression (i.e., force grep to behave as egrep).
For example (from @Renaud's comment):
grep -oE ".{0,10}mysearchstring.{0,10}" myfile.txt
Alternatively, you could try -c:
-c, --count
Suppress normal output; instead print a count of matching lines
for each input file. With the -v, --invert-match option (see
below), count non-matching lines.
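To see what the context trick does on a long line, here is a small sketch. The function name context and its second argument (the number of context characters) are assumptions made for the example.

```shell
# Hypothetical helper: print each match of the ERE in $1 with up to $2
# characters of context on either side.
context() {
  grep -oE ".{0,$2}$1.{0,$2}"
}

long_line='aaaaaaaaaaaaaaaaaaaaXXXXXmysearchstringYYYYYbbbbbbbbbbbbbbbbbbbb'
printf '%s\n' "$long_line" | context mysearchstring 10
```

For this input the output is aaaaaXXXXXmysearchstringYYYYYbbbbb: the match plus ten characters on each side, with the rest of the line dropped.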
Pipe your results thru cut. I'm also considering adding a --cut switch so you could say --cut=80 and only get 80 columns.
You could use less as a pager for ack and chop long lines: ack --pager="less -S" This retains the long line but leaves it on one line instead of wrapping. To see more of the line, scroll left/right in less with the arrow keys.
I have the following alias setup for ack to do this:
alias ick='ack -i --pager="less -R -S"'
grep -oE ".{0,10}error.{0,10}" mylogfile.txt
In the unusual situation where you cannot use -E, drop it and escape the braces instead:
grep -o ".\{0,10\}error.\{0,10\}" mylogfile.txt
Explanation:
cut -c 1-100
gets characters from 1 to 100.
The Silver Searcher (ag) supports it natively via the --width NUM option. It will replace the rest of longer lines by [...].
Example (truncate after 120 characters):
$ ag --width 120 '@patternfly'
...
1:{"version":3,"file":"react-icons.js","sources":["../../node_modules/@patternfly/ [...]
In ack3, a similar feature is planned but currently not implemented.
Taken from: http://www.topbug.net/blog/2016/08/18/truncate-long-matching-lines-of-grep-a-solution-that-preserves-color/
The suggested approach ".{0,10}<original pattern>.{0,10}" is perfectly good except for that the highlighting color is often messed up. I've created a script with a similar output but the color is also preserved:
#!/bin/bash
# Usage:
# grepl PATTERN [FILE]
# how many characters around the searching keyword should be shown?
context_length=10
# What is the length of the control character for the color before and after the
# matching string?
# This is mostly determined by the environmental variable GREP_COLORS.
control_length_before=$(($(echo a | grep --color=always a | cut -d a -f '1' | wc -c)-1))
control_length_after=$(($(echo a | grep --color=always a | cut -d a -f '2' | wc -c)-1))
grep -E --color=always "$1" $2 |
grep --color=none -oE \
".{0,$(($control_length_before + $context_length))}$1.{0,$(($control_length_after + $context_length))}"
Assuming the script is saved as grepl, then grepl pattern file_with_long_lines should display the matching lines but with only 10 characters around the matching string.
I put the following into my .bashrc:
grepl() {
$(which grep) --color=always "$@" | less -RS
}
You can then use grepl on the command line with any arguments that are available for grep. Use the arrow keys to see the tail of longer lines. Use q to quit.
Explanation:
grepl() {: Define a new function that will be available in every (new) bash console.
$(which grep): Get the full path of grep. (Ubuntu defines an alias for grep that is equivalent to grep --color=auto. We don't want that alias but the original grep.)
--color=always: Colorize the output. (--color=auto from the alias won't work since grep detects that the output is put into a pipe and won't color it then.)
"$@": Put all arguments given to the grepl function here.
less: Display the lines using less
-R: Show colors
-S: Don't break long lines
Here's what I do:
function grep () {
tput rmam;
command grep "$@";
tput smam;
}
In my .bash_profile, I override grep so that it automatically runs tput rmam before and tput smam after, which disables wrapping and then re-enables it.
ag can also take the regex trick, if you prefer it:
ag --column -o ".{0,20}error.{0,20}"

Can grep show only words that match search pattern?

Is there a way to make grep output "words" from files that match the search expression?
If I want to find all the instances of, say, "th" in a number of files, I can do:
grep "th" *
but the output will be something like (bold is by me);
some-text-file : the cat sat on the mat
some-other-text-file : the quick brown fox
yet-another-text-file : i hope this explains it thoroughly
What I want it to output, using the same search, is:
the
the
the
this
thoroughly
Is this possible using grep? Or using another combination of tools?
Try grep -o:
grep -oh "\w*th\w*" *
Edit: matching from Phil's comment.
From the docs:
-h, --no-filename
Suppress the prefixing of file names on output. This is the default
when there is only one file (or only standard input) to search.
-o, --only-matching
Print only the matched (non-empty) parts of a matching line,
with each such part on a separate output line.
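A quick way to try this is to recreate the three files from the question in a scratch directory. The sketch below assumes GNU grep, since \w is not part of POSIX regular expressions.

```shell
# Recreate the sample files from the question in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' 'the cat sat on the mat' > "$dir/some-text-file"
printf '%s\n' 'the quick brown fox' > "$dir/some-other-text-file"
printf '%s\n' 'i hope this explains it thoroughly' > "$dir/yet-another-text-file"

# -o prints only the matched words, -h suppresses the file name prefix.
words=$(grep -oh '\w*th\w*' "$dir"/*)
printf '%s\n' "$words"

rm -r "$dir"
```

This prints the, the, the, this and thoroughly, one per line, which is the output the question asks for.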
Cross distribution safe answer (including windows minGW?)
grep -h "[[:alpha:]]*th[[:alpha:]]*" 'filename' | tr ' ' '\n' | grep -h "[[:alpha:]]*th[[:alpha:]]*"
If you're using older versions of grep (like 2.4.2) which do not include the -o option, then use the above. Else use the simpler to maintain version below.
Linux cross distribution safe answer
grep -oh "[[:alpha:]]*th[[:alpha:]]*" 'filename'
To summarize: -oh outputs the regular expression matches from the file content (and not the file name), just as you would expect a regular expression to work in vim, etc. Which word or regular expression you search for is then up to you, as long as you stay with POSIX and not Perl syntax (see below).
More from the manual for grep
-o Print each match, but only the match, not the entire line.
-h Never print filename headers (i.e. filenames) with output lines.
-w The expression is searched for as a word (as if surrounded by
`[[:<:]]' and `[[:>:]]';
The reason why the original answer does not work for everyone
The usage of \w varies from platform to platform, as it's an extended "perl" syntax. As such, those grep installations that are limited to work with POSIX character classes use [[:alpha:]] and not its perl equivalent of \w. See the Wikipedia page on regular expression for more
Ultimately, the POSIX answer above will be a lot more reliable regardless of platform (being the original) for grep
As for support of grep without -o option, the first grep outputs the relevant lines, the tr splits the spaces to new lines, the final grep filters only for the respective lines.
(PS: I know most platforms by now would have been patched for \w.... but there are always those that lag behind)
Credit for the "-o" workaround goes to @AdamRosenfield's answer
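The tr-then-grep idea for greps without -o can be sketched as a small filter. The name words_containing is made up, and the sketch splits on any whitespace rather than only on spaces, which is a slight variation on the command above.

```shell
# Hypothetical helper: print every whitespace-separated word that contains
# the string in $1 (assumed to be free of regex metacharacters).
words_containing() {
  tr -s '[:space:]' '\n' | grep "$1"
}

printf '%s\n' 'i hope this explains it thoroughly' | words_containing th
```

Each word ends up on its own line before grep sees it, so grep's normal line-based output is already the list of matching words. Punctuation attached to a word is kept.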
It's more simple than you think. Try this:
egrep -wo 'th.[a-z]*' filename.txt #### (Case Sensitive)
egrep -iwo 'th.[a-z]*' filename.txt ### (Case Insensitive)
Where,
egrep: Grep will work with extended regular expression.
w : Matches only word/words instead of substring.
o : Display only matched pattern instead of whole line.
i : Ignore case when matching.
You could translate spaces to newlines and then grep, e.g.:
cat * | tr ' ' '\n' | grep th
Just awk, no need for a combination of tools.
# awk '{for(i=1;i<=NF;i++){if($i~/^th/){print $i}}}' file
the
the
the
this
thoroughly
grep command for only matching and perl
grep -o -P 'th.*? ' filename
I was unsatisfied with awk's hard to remember syntax but I liked the idea of using one utility to do this.
It seems like ack (or ack-grep if you use Ubuntu) can do this easily:
# ack-grep -ho "\bth.*?\b" *
the
the
the
this
thoroughly
If you omit the -h flag you get:
# ack-grep -o "\bth.*?\b" *
some-other-text-file
1:the
some-text-file
1:the
the
yet-another-text-file
1:this
thoroughly
As a bonus, you can use the --output flag to do this for more complex searches with just about the easiest syntax I've found:
# echo "bug: 1, id: 5, time: 12/27/2010" > test-file
# ack-grep -ho "bug: (\d*), id: (\d*), time: (.*)" --output '$1, $2, $3' test-file
1, 5, 12/27/2010
cat *-text-file | grep -Eio "th[a-z]+"
You can also try pcregrep. There is also a -w option in grep, but in some cases it doesn't work as expected.
From Wikipedia:
cat fruitlist.txt
apple
apples
pineapple
apple-
apple-fruit
fruit-apple
grep -w apple fruitlist.txt
apple
apple-
apple-fruit
fruit-apple
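When -w is too loose because characters such as the hyphen count as word boundaries, one stricter alternative is grep -x, which only accepts lines that match the pattern in full. A small sketch using the same fruit list:

```shell
fruits='apple
apples
pineapple
apple-
apple-fruit
fruit-apple'

# -w: "apple" must be delimited by non-word characters (a hyphen qualifies).
wordish=$(printf '%s\n' "$fruits" | grep -w apple)

# -x: the whole line must be exactly "apple".
exact=$(printf '%s\n' "$fruits" | grep -x apple)

printf -- '-w:\n%s\n-x:\n%s\n' "$wordish" "$exact"
```

Here -w keeps four lines while -x keeps only the single line apple. Note that -x works on whole lines, so it only helps when each candidate is on a line of its own.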
I had a similar problem, looking for grep/pattern regex and the "matched pattern found" as output.
At the end I used egrep (same regex on grep -e or -G didn't give me the same result of egrep) with the option -o
so, I think that could be something similar to (I'm NOT a regex Master) :
egrep -o "the*|this{1}|thoroughly{1}" filename
To search for all the words that start with "icon-", the following command works perfectly. I am using Ack here, which is similar to grep but with better options and nice formatting.
ack -oh --type=html "\w*icon-\w*" | sort | uniq
You could pipe your grep output into Perl like this:
grep "th" * | perl -n -e'while(/(\w*th\w*)/g) {print "$1\n"}'
grep --color -o -E "Begin.{0,}?End" file.txt
? - Match as few as possible until the End
Tested on macos terminal
$ grep -w
Excerpt from grep man page:
-w: Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character. Similarly, it must be either at the end of the line or followed by a non-word constituent character.
ripgrep
Here is an example using ripgrep:
rg -o "(\w+)?th(\w+)?"
It'll match all words containing th.

Resources