grep show all lines, not just matches, set exit status - grep

I'm piping the output of a command to egrep, which I'm using to make sure a particular failure string doesn't appear in it.
The command itself, unfortunately, won't return a proper non-zero exit status on failure, which is why I'm doing this.
command | egrep -i -v "badpattern"
This works as far as giving me the exit code I want (1 if badpattern appears in the output, 0 otherwise), BUT, it'll only output lines that don't match the pattern (as the -v switch was designed to do). For my needs, those lines are the most interesting lines.
Is there a way to have grep just blindly pass through all lines it gets as input, and just give me the exit code as appropriate?
If not, I was thinking I could just use perl -ne "print; exit 1 if /badpattern/". I use -n rather than -p because -p won't print the offending line (since it prints after running the one-liner). So, I use -n and call print myself, which at least gives me the first offending line, but then output (and execution) stops there, so I'd have to do something like
perl -e '$code = 0; while (<>) { print; $code = 1 if /badpattern/; } exit $code'
which does the whole deal, but is a bit much. Is there a simple command line switch for grep that will just do what I'm looking for?

Actually, your perl idea is not bad. Try:
perl -pe 'END { exit $status } $status=1 if /badpattern/;'
I bet this is at least as fast as the other options being suggested.
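The same pass-everything-and-remember-the-match idea can also be sketched in awk, for systems where perl is not at hand. This is only a minimal sketch, not a grep switch: the function name pass_and_flag is made up for illustration, and unlike egrep -i the match here is case-sensitive.

```shell
# pass_and_flag is a hypothetical helper name, not a standard tool.
# It prints every input line unchanged and exits 1 if any line matched.
pass_and_flag() {
    # $1 is treated as an awk extended regular expression (case-sensitive)
    awk -v pat="$1" '
        $0 ~ pat { bad = 1 }    # remember that the failure string appeared
        { print }               # pass every line through untouched
        END { exit bad + 0 }    # 0 if the pattern never matched, 1 otherwise
    '
}
```

It would be used as command | pass_and_flag "badpattern", with the exit status of the pipeline coming from the helper.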

$ tee /dev/tty < ~/.bashrc | grep -q spam && echo spam || echo no spam

How about doing a redirect to /dev/null, hence removing all lines, but you still get the exit code?
$ grep spam .bashrc > /dev/null
$ echo $?
1
$ grep alias .bashrc > /dev/null
$ echo $?
0
Or you can simply use the -q switch
-q, --quiet, --silent
Quiet; do not write anything to standard output. Exit
immediately with zero status if any match is found, even if an
error was detected. Also see the -s or --no-messages option.
(-q is specified by POSIX.)

Related

using grep command to get specific word [LINUX]

I have a test.txt file with links for example:
google.com?test=
google.com?hello=
and this code
xargs -0 -n1 -a FUZZvul.txt -d '\n' -P 20 -I % curl -ks1L '%/?=DarkLotus' | grep -a 'DarkLotus'
When I type a specific word, such as DarkLotus, in the terminal, it checks the links in the file and shows me the word wherever it is reflected by the links I provided in the test file.
That part works. The problem is that I have many links, and when the result appears in the terminal, I do not know which site reflected the word DarkLotus.
How can I do it?
Try -n option. It shows the line number of file with the matched line.
Best Regards,
Haridas.
I'm not sure what you are up to there, but can you invert it? grep by default prints matching lines. The problem here is you are piping the input from the stdout of the previous commands into grep, and that can lack context at grep. Since you have a file to work with:
$ grep 'DarkLotus' FUZZvul.txt
If your intention is to also follow the link then it might be easier to write a bash script:
#!/bin/bash
for line in `grep 'DarkLotus' FUZZvul.txt`
do
link=# extract link from line
echo ${link}
curl -ks1L ${link}
done
Then you could make your script accept user input:
#!/bin/bash
word="${1}"
for line in `grep "${word}" FUZZvul.txt`
...
and then
$ my_link_getter "DarkLotus"
https://google?somearg=DarkLotus
...
And then you could make the txt file a parameter.
etc.
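To see which link reflected the word, one option is to fetch each link in its own loop iteration and print the link only when its response contains the word. The following is a rough sketch under several assumptions: fetch_url and report_reflecting are hypothetical names, the word is simply appended to each link (the sample links end in "="), and the requests run one at a time rather than 20 in parallel.

```shell
# fetch_url is a hypothetical wrapper; swap in any command that prints a page body.
fetch_url() {
    curl -ksL "$1"
}

# Print each URL from the list file ($2) whose response contains the word ($1).
report_reflecting() {
    word=$1
    list=$2
    while IFS= read -r url; do
        [ -n "$url" ] || continue                 # skip blank lines
        # Assumption: the word is appended directly to the link
        if fetch_url "${url}${word}" | grep -qaF -- "$word"; then
            printf '%s\n' "$url"                  # this link reflected the word
        fi
    done < "$list"
}
```

Calling report_reflecting DarkLotus FUZZvul.txt would then list only the links whose responses contain DarkLotus.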

Error handling in Makefile

What is wrong with this Makefile?
I want to compile some lua files to check if there are any unexpected globals defined. I'm doing this by grepping the output of luac -l and then ignoring known globals.
So for a given lua file everything is OK if grep doesn't find anything, having ignored known lua globals.
As grep's return status code is 0 if it does find something and 1 if it doesn't I want to force an error if the status code from the grep is 0 and allow everything to continue if it isn't.
The Makefile is like this
IGNORE_GLOBALS = "dofile\|string\|tostring\|tonumber\|math\|io\|type\|os\|table\|pairs\|next\|require"
all: $(patsubst src/common/%.lua, %.lua, $(wildcard src/common/*.lua))
%.lua:
@echo check $@
@luac -l src/common/$@ | grep '.ETGLOBAL' | grep -v $(IGNORE_GLOBALS) && $(error Unexpected globals in $@) || echo "No unexpected globals in $@"
But when I run it immediately quits on the first file, which happens to have no unexpected globals with
Makefile:10: *** Unexpected globals in chat-cmd.lua. Stop.
Line 10 is surprisingly the line before, i.e.
@echo check $@
Interestingly if I replace $(error ...) with echo ..., as in
@luac -l src/common/$@ | grep '.ETGLOBAL' | grep -v $(IGNORE_GLOBALS) && echo "Unexpected globals in $@" || echo "No unexpected globals in $@"
it behaves as intended.
As @siffiejoe says in the comment, $(error) is a make function and is run when the recipe as a whole is being evaluated (you can think of it like hoisting if that helps).
So as soon as the recipe needs to be run (and the first line executed) the $(error) call is evaluated.
Note: In the shell X && Y || Z is not a ternary operation. Z will be run if X succeeds and Y fails as well as when X fails. This doesn't matter here as echo cannot really fail but in general is worth paying attention to.
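A tiny demonstration of that pitfall, using a made-up function name (ternary_pitfall) purely for illustration:

```shell
# ternary_pitfall is a hypothetical name used only for this demonstration.
ternary_pitfall() {
    # X (true) succeeds, Y (false) fails, so Z runs as well
    true && false || echo "Z ran"
}
```

Calling ternary_pitfall prints "Z ran" even though X succeeded, which a real ternary would never do.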
You want to use something more like @! lua ... | grep -v $(IGNORE_GLOBALS) || { echo 'Unexpected globals in $@'; exit 1; } there. This doesn't spit out the "everything's ok" message but removes the X && Y || Z ternary issue.
If you wanted to keep that message the simplest thing to do would be to move to an actual if statement.
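As a sketch of what the if-statement version might look like, here is the shell side on its own, written as a function so it can be tried outside make. The name check_globals is hypothetical, and the function assumes the luac -l listing arrives on standard input; inside a recipe the same if/else would need make's usual line continuations and $$ escaping.

```shell
# Same ignore list as the Makefile, without the surrounding make-level quotes.
IGNORE_GLOBALS='dofile\|string\|tostring\|tonumber\|math\|io\|type\|os\|table\|pairs\|next\|require'

# check_globals is a hypothetical helper: it reads a "luac -l" listing on stdin,
# and $1 is only a label used in the messages.
check_globals() {
    # The inner pipeline succeeds (status 0) only if an unexpected global is printed
    if grep '.ETGLOBAL' | grep -v "$IGNORE_GLOBALS"; then
        echo "Unexpected globals in $1" >&2
        return 1
    else
        echo "No unexpected globals in $1"
    fi
}
```

For example, luac -l src/common/chat-cmd.lua | check_globals chat-cmd.lua fails only when an unexpected global shows up, and both messages are kept without relying on X && Y || Z.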

Best way to search the path in shell

I've got a small script called "onewhich". Its purpose is to behave like which, except that it will only give the FIRST occurrence of any executables specified as options, as found in the order they'd appear in the path.
So for example, if my path is /opt/bin:/usr/bin:/bin, and I have both /opt/bin/runme and /usr/bin/runme, then the command onewhich runme would return /opt/bin/runme.
But if I also have a /usr/bin/doit, then the command onewhich doit runme would return /usr/bin/doit instead.
The idea is to walk through the path, check for each executable specified, and if it exists, show it and exit.
Here's the script so far.
#!/bin/sh
for what in "$@"; do
for loc in `echo "${PATH}" | awk -vRS=: 1`; do
if [ -f "${loc}/${what}" ]; then
echo "${loc}/${what}"
exit 0
fi
done
done
exit 1
The problem is, I want to be better about PATH directories with special characters. Every second shell question here on StackOverflow talks about how bad it is to parse paths with tools like awk and sed. There's even a bash faq entry about it. (Proviso: I'm not using bash for this, but the recommendation is still valid.)
So I tried rewriting the script to separate paths in a pipe, like this:
#!/bin/sh
for what in "$@"; do
echo "${PATH}" | awk -vRS=: 1 | while read loc ; do
if [ -f "${loc}/${what}" ]; then
echo "${loc}/${what}"
exit 0
fi
done
done
exit 1
I'm not sure if this gives me any real advantage (since $loc is still inside quotes), but it also doesn't work because for some reason, the exit 0 seems to be ignored. Or ... it exits something (the sub-shell with the while loop that terminates the pipe, maybe), but the script exits with a value of 1 every time.
What's a better way to step through directories in ${PATH} without the risk that special characters will confuse things?
Alternately, am I reinventing the wheel? Is there maybe a way to do this that's built in to existing shell tools?
This needs to run in both Linux and FreeBSD, which is why I'm writing it in Bourne instead of bash.
Thanks.
This doesn't directly answer your question, but does eliminate the need to parse PATH at all:
onewhich () {
for what in "$@"; do
which "$what" 2>/dev/null && break
done
}
This just calls which on each command on the input list until it finds a match.
To parse PATH, you can simply set IFS=':'.
if [ -n "${IFS+set}" ]; then
# Only preserve the value of IFS if it is currently set
OLDIFS=$IFS
fi
IFS=":"
for f in $PATH; do # Do not quote $PATH, to allow word splitting
echo "$f"
done
if [ -n "${OLDIFS+set}" ]; then
IFS=$OLDIFS
else
unset IFS # IFS was unset before, so restore that state
fi
The above will fail if any of the directories in PATH actually contain colons.
Your first method looks to me as if it should work. In practical terms, if it's really the $PATH you'll be searching, it's unlikely you'll have spaces and newlines embedded in directories there. If you do, it's probably time to refactor.
But still, I don't think you're at risk from the possibility of bad names clobbering your loop, since you're wrapping variables in quotes. At worst, I suspect you might miss the odd valid executable, but I can't see how the script would generate errors. (I don't see how the script would miss valid executables, and I haven't tested - I'm just saying I don't see problems at first glance.)
As for your second question, about the loop, I think you've hit the nail on the head. When you run a pipe like this | that | while condition; do things; done, the while loop runs in its own shell at the end of the pipe. Exiting that shell may terminate the actions of the pipe, but that only brings you back to the parent shell, which has its own thread of execution that terminates with exit 1.
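One way to keep the pipe and still get the right exit status is to let the loop's subshell report success or failure and have the parent act on it. The sketch below assumes that approach; onewhich_pipe is a hypothetical name, tr stands in for the awk splitter, and the explicit parentheses make the subshell visible rather than implied.

```shell
# onewhich_pipe is a hypothetical rewrite of the piped version of the script.
onewhich_pipe() {
    for what in "$@"; do
        printf '%s\n' "$PATH" | tr ':' '\n' | (
            while IFS= read -r loc; do
                if [ -f "$loc/$what" ]; then
                    printf '%s\n' "$loc/$what"
                    exit 0      # leaves only this subshell, not the caller
                fi
            done
            exit 1              # nothing found for this name
        ) && return 0           # the parent reacts to the subshell's status
    done
    return 1
}
```

The pipeline's status is that of its last command, the subshell, so the parent can return 0 as soon as one name is found. Directories containing newlines would still confuse this version.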
As for a better way to do this, I would consider which.
#!/bin/sh
for what in "$@"; do
which "$what"
done | head -1
And if you really want the exit values as well:
#!/bin/sh
for what in "$@"; do
which "$what" && exit 0
done
exit 1
The second might even use fewer resources, as it doesn't have to open a file handle and pipe through head.
You can also split your path using IFS. For example, if you wanted to wrap your loops the other way around, you could do this:
#!/bin/sh
IFS=":"
for loc in $PATH; do
for what in "$@"; do
if [ -x "$loc"/"$what" ]; then
echo "$loc"/"$what"
exit 0
fi
done
done
exit 1
Note that under normal circumstances, you might want to save the old value of $IFS, but you seem to be doing things in a stand-alone script, so the "new" value gets thrown out when the script exits.
All the above code is untested. YMMV.
Another way to get around the need to parse PATH at all is to run the builtin type command in a new shell with a stripped environment (i.e. there simply are no functions or aliases to look up; cf. env -i sh -c 'type cmd 2>/dev/null').
# using `cmd` instead of $(cmd) for portability
onewhich() {
ec=0 # exit code
for cmd in "$@"; do
command -p env -i PATH="$PATH" sh -c '
export LC_ALL=C LANG=C
cmd="$1"
path="`type "$cmd" 2>/dev/null`"
if [ X"$path" = "X" ]; then
printf "%s\n" "error: command \"${cmd}\" not found in PATH" 1>&2
exit 1
else
case "$path" in
*\ /*)
path="/${path#*/}"
printf "%s\n" "$path";;
*)
printf "%s\n" "error: no disk file: $path" 1>&2
exit 1;;
esac
exit 0
fi
' _ "$cmd"
[ $? != 0 ] && ec=1
done
return $ec # a bare "[ $ec != 0 ] && return 1" would make the function return 1 even on success
}
onewhich awk ls sed
onewhich builtin
onewhich if
Since which on success returns two full command paths if two commands are specified as arguments, exit 0 in the first onewhich script above aborts the program prematurely. In addition, if two commands are specified as arguments to which, the exit code of which is set to 1 even if only one command lookup failed (cf. which awk sedxyz ls; echo $?). To mimic this behaviour of the which command it is necessary to toggle on/off two variables (cnt and nomatches below).
onewhich() (
IFS=":"
nomatches=0
for cmd in "$@"; do
cnt=0
for loc in $PATH ; do
if [ $cnt = 0 ] && [ -x "$loc"/"$cmd" ]; then
echo "$loc"/"$cmd"
cnt=1
fi
done
[ $cnt = 0 ] && nomatches=1
done
[ $nomatches = 1 ] && exit 1 || exit 0 # exit 1: at least one cmd was not in PATH
)
onewhich awk ls sed
onewhich awk lsxyz sed
onewhich builtin
onewhich if

parse maven output in real time using sed

I am trying to parse my mvn verify output to only show lines with INFO tags. Please note that maven outputs line to stdout in real time and not by batch. I do not think that it is a problem with maven.
At first I tried to do it with grep:
$ mvn verify | grep INFO
but it didn't seem to output lines in real time. As I understand it, grep buffers its lines before outputting, so I have to wait a few seconds between each flush and then tens of lines are printed at the same time, which is not very convenient. Then I thought I would try with sed.
According to this link, the following command:
sed -n '/PATTERN/p' file
// is equivalent to
grep PATTERN file
and according to this link, the -l option should force sed to flush its output buffer after every newline. So now I am using this command:
$ mvn verify | sed -ln -e '/INFO/p'
but I'm still getting the same result as before, I get a ton of output flushed every 30s or so and I don't know what I've done wrong. Can someone point me in the right direction please?
Try this, if your grep supports it:
mvn verify | grep --line-buffered INFO
If you're doing this in a terminal and still seeing buffered results, it would probably be something earlier than grep doing the buffering, but I'm not familiar with mvn. (And, yes, the -l option to sed should have done the same thing, so the problem may be upstream.)
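If your grep does not support --line-buffered, a similar effect can be sketched with awk by flushing after every printed line. This swaps in a different tool from the one suggested above and assumes your awk provides fflush(); info_lines is a hypothetical name.

```shell
# info_lines is a hypothetical filter name.
# It prints only lines containing INFO and flushes after each one,
# so the lines are not held back in a pipe buffer.
info_lines() {
    awk '/INFO/ { print; fflush() }'
}
```

It would be used as mvn verify | info_lines. As with grep, this only helps if nothing earlier in the pipeline is buffering.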
Try this line:
mvn verify | while IFS= read -r line; do printf '%s\n' "$line" | grep INFO; done
I found the problem: I was using a script to colorise maven output (see here), and it was that script that was buffering the output down the pipe. I forgot about it because I was using it through an alias. I guess this is a good lesson; I won't alias as easily in the future. Anyway, here is the fix: I changed -e to -le in the last line of the sed call:
mvn $@ | sed -e "s/\(\[INFO\]\ \-.*\)/${TEXT_BLUE}${BOLD}\1/g" \
-e "s/\(\[INFO\]\ \[.*\)/${RESET_FORMATTING}${BOLD}\1${RESET_FORMATTING}/g" \
-e "s/\(\[INFO\]\ BUILD SUCCESSFUL\)/${BOLD}${TEXT_GREEN}\1${RESET_FORMATTING}/g" \
-e "s/\(\[WARNING\].*\)/${BOLD}${TEXT_YELLOW}\1${RESET_FORMATTING}/g" \
-e "s/\(\[ERROR\].*\)/${BOLD}${TEXT_RED}\1${RESET_FORMATTING}/g" \
-le "s/Tests run: \([^,]*\), Failures: \([^,]*\), Errors: \([^,]*\), Skipped: \([^,]*\)/${BOLD}${TEXT_GREEN}Tests run: \1${RESET_FORMATTING}, Failures: ${BOLD}${TEXT_RED}\2${RESET_FORMATTING}, Errors: ${BOLD}${TEXT_RED}\3${RESET_FORMATTING}, Skipped: ${BOLD}${TEXT_YELLOW}\4${RESET_FORMATTING}/g"
In effect this is telling sed to flush its output at every new line, which is what I wanted. I am sorry I didn't find another workaround that is more generic. I tried playing around with empty (see man page) and script but none of these solutions worked for me.

How can I remove duplicates (deduplicate) a mbox format email mailbox?

I've got a mbox mailbox containing duplicate copies of messages, which differ only in their "X-Evolution:" header.
I want to remove the duplicate ones, in as quick and simple a way as possible. It seems like this would have been written already, but I haven't found it, although I've looked at the Python mailbox module, the various perl mbox parsers, formail, and so forth.
Does anyone have any suggestions?
This is a small script that I used for it:
#!/bin/bash
IDCACHE=$(mktemp -p /tmp)
formail -D $((1024*1024*10)) ${IDCACHE} -s
rm ${IDCACHE}
The mailbox needs to be piped through it, and in the meantime it will be deduplicated.
-D $((1024*1024*10)) sets a 10 mebibyte cache, which is more than 10x the amount needed to deduplicate an entire year of my mail. YMMV, so adjust it accordingly. Setting it too high will cause some performance loss; setting it too low will let duplicates slip through.
formail is part of the procmail utility bundle, mktemp is part of coreutils.
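Where formail is not installed, the same idea (keep the first message for each Message-ID, drop later ones) can be roughly sketched in awk. This is not how formail works internally and not a replacement for it. The sketch assumes that message separator lines start with "From " (body lines being escaped as ">From "), that the Message-ID value is on the same line as the header name, and that messages without a Message-ID should all be kept; dedup_mbox is a hypothetical name.

```shell
# dedup_mbox is a hypothetical helper: mbox on stdin, deduplicated mbox on stdout.
dedup_mbox() {
    awk '
        # Print the buffered message unless its Message-ID was already seen.
        function flush_msg() {
            if (msg != "" && !(id in seen)) {
                printf "%s", msg
                if (id != "") seen[id] = 1
            }
            msg = ""; id = ""
        }
        /^From / { flush_msg(); in_hdr = 1 }     # separator line starts a new message
        in_hdr && /^$/ { in_hdr = 0 }            # first blank line ends the headers
        in_hdr && id == "" && tolower($0) ~ /^message-id:/ {
            id = $0
            sub(/^[^:]*:[ \t]*/, "", id)         # keep only the header value
        }
        { msg = msg $0 "\n" }                    # buffer the current line
        END { flush_msg() }
    '
}
```

It would be used as dedup_mbox < old.mbox > new.mbox. Because only the Message-ID is compared, copies that differ in their X-Evolution header still count as duplicates.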
I didn't look at formail (part of procmail) in enough detail. It does have such an option, as mentioned in places like: http://hints.macworld.com/comment.php?mode=view&cid=115683 and http://us.generation-nt.com/answer/deleting-duplicate-mail-messages-help-172481881.html
'formail -D' and 'reformail -D' can only process one email per execution, so each mail needs to be separated from the mbox before being processed. I use reformail from maildrop instead, since it's still in active development.
Remove any old idcache, tmpmail, and nmbox files.
Run dedup.sh with the mbox as its argument.
nmbox is the output with duplicate messages removed.
dedup.sh
#! /bin/sh
# $1 = mbox, thunderbird mailbox
# wmbox.sh is called for each mail.
cat $1 | reformail -s ./wmbox.sh
wmbox.sh
#! /bin/sh
# stdin: an email
# called by dedup.sh
TM=tmpmail
if [ -f $TM ] ; then
echo error!
exit 1
fi
cat > $TM
# mbox format, each mail end with a blank line
echo "" >> $TM
cat $TM | reformail -D 99999999 idcache
# if this mail isn't a dup (reformail return 1 if message-id is not found)
if [ $? != 0 ]; then
# each mail shall have a message-id
if grep -q -i '^message-id:' $TM; then
cat $TM >> nmbox
fi
fi
rm $TM
