How to use grep for such case? - grep

grep is not behaving the way I expect.
Here is my code:
echo -n "Title: " # prompt user for title
read title # get input from keyboard
echo -n "Author: " # prompt user for author
read author # get input from keyboard
if grep -q -i -w $title BookDB.txt # check for title in BookDB.txt
then # if duplicate exist
clear # clear screen
echo "Error! Book already exists!" # prompt user about duplicated entry
echo " " # newline
continue # display main menu
else # if duplicate absent
echo -n "Price: " # prompt user for price
read price # get input from keyboard
echo -n "Qty Available: " # prompt user for qty available
read qtyAvail # get input from keyboard
echo -n "Qty Sold: " # prompt user for qty sold
read qtySold # get input from keyboard
For example, "Lord of the ring" is already in BookDB.txt.
When I add a new book, the script checks whether it already exists.
However, if I try to add "Lord of the Stone", it says the book already exists.
Please advise.

Always quote your shell variables:
if grep -q -i -w "$title" BookDB.txt
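Otherwise, the shell splits the expanded value on whitespace and passes each piece as a separate argument. For the title "Lord of the Stone", grep then searches for "Lord" in the files named of, the, Stone, and BookDB.txt. Because "Lord" does occur in BookDB.txt, grep reports a match.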

Minimal solution:
#!/bin/bash
echo -n "Title: " # prompt user for title
read title # get input from keyboard
if grep -q -i -w "$title" BookDB.txt ; then
echo "Found"
else
echo "do something else"
fi
Try with and without the quoted $title to see the effect.
Also note that this still matches partial titles: searching for "Lord" will find both "Lord of the Rings" and "Lord Of The Flies". But you have bigger problems ahead of you...
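One way to avoid the partial-match problem is to compare the whole title instead of searching for words inside a line. The following is only a sketch: it assumes (the question does not say) that each line of BookDB.txt starts with the title followed by a colon, and the helper name title_exists and the sample data are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the DB file ($2) has a book whose title is exactly $1.
# Assumes a colon-separated layout "Title:Author:..." (an assumption, not from the question).
title_exists() {
    # cut keeps only the title field; grep then compares whole lines:
    # -q quiet, -i ignore case, -x whole-line match, -F plain string (no regex).
    cut -d: -f1 "$2" | grep -qixF -- "$1"
}

db=$(mktemp)
printf '%s\n' 'Lord of the Rings:J.R.R. Tolkien' 'Lord of the Flies:William Golding' > "$db"

title_exists 'lord of the rings' "$db" && echo "already exists"
title_exists 'Lord' "$db" || echo "no exact match for Lord"
title_exists 'Lord of the Stone' "$db" || echo "Lord of the Stone is new"
rm -f "$db"
```

Because -F treats the title as a plain string, characters such as . or * in a title are not interpreted as a regex.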

Related

show filename with matching word from grep only

I am trying to find which words occur in log files, and to show the log file name, for anything that matches the following pattern:
'BA10\|BA20\|BA21\|BA30\|BA31\|BA00'
So if the file dummylogfile.log contains BA10002, I would like to get a result such as:
dummylogfile.log:BA10002
It is totally fine if the log file shows up twice for duplicate matches.
The closest I got is:
for f in $(find . -name '*.err' -exec grep -l 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' {} \+);do printf $f;printf ':';grep -o 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' $f;done
but this gives things like:
./register-05-14-11-53-59_24154.err:BA10
BA10
./register_mdw_files_2020-05-14-11-54-32_24429.err:BA10
BA10
./process_tables.2020-05-18-11-18-09_11428.err:BA30
./status_load_2020-05-18-11-35-31_9185.err:BA30
So:
1) there are lines with only the second match and no file name, and
2) the full match (e.g., BA10004) is not shown.
Thanks for the help.
There are a couple of options you can pass to grep:
-H: This will report the filename and the match
-o: only show the match, not the full line
-w: The match must be a full word (a string built from [A-Za-z0-9_])
Your regex uses short alternatives such as BA10. A pattern like BA01 matches only the four characters BA01, and it can match anywhere in the text, including mid-word. If you want the regex to match a full word, it should read BA01[[:alnum:]_]*, which appends any run of word-constituent characters (equivalent to [A-Za-z0-9_] in the C locale). You can test this with:
$ echo "foo BA01234 barBA012" | grep -Ho "BA01"
(standard input):BA01
(standard input):BA01
$ echo "foo BA01234 barBA012" | grep -How "BA01"
$ echo "foo BA01234 barBA012" | grep -How "BA01[[:alnum:]_]*"
(standard input):BA01234
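So your grep should look like this (the whole pattern goes inside one pair of single quotes; quotes nested inside the pattern would be matched literally):
grep -How '\(BA10\|BA20\|BA21\|BA30\|BA31\|BA00\)[[:alnum:]_]*' *.err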
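To see what -H, -o and -w do together, here is a small self-contained sketch. The file name and its contents are made up for the demonstration, and the \| alternation inside a basic regex is a GNU grep extension, so this assumes GNU grep.

```shell
#!/bin/sh
# Throwaway directory with one sample log (hypothetical name and contents).
dir=$(mktemp -d)
printf '%s\n' 'load failed BA10002 then BA10004' \
              'unrelated line' \
              'xBA30001 is mid-word' > "$dir/dummylogfile.err"

# -H: print the file name, -o: print only the matched text,
# -w: accept a match only if it forms a whole word.
out=$(cd "$dir" && grep -How '\(BA10\|BA20\|BA21\|BA30\|BA31\|BA00\)[[:alnum:]_]*' *.err)
printf '%s\n' "$out"

rm -rf "$dir"
```

With GNU grep this should print dummylogfile.err:BA10002 and dummylogfile.err:BA10004 on separate lines; xBA30001 is skipped because the match would start in the middle of a word.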
From your example it seems that all files are in one directory. In that case the following works right away (note that -l lists only the names of the matching files, not the matches themselves):
grep -l 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' *.err
If the files are in different directories:
find . -name '*.err' -print | xargs -I {} grep 'BA10\|BA20\|BA21\|BA30\|BA31\|BA00' {} /dev/null
Explanation: adding /dev/null as a second file argument after {} forces grep to print the name of the matching file, because grep prefixes file names whenever it is given more than one file.
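For files spread over several directories, find can also hand the files to grep directly, and -H does the job of the /dev/null trick. This is a sketch on a made-up directory tree, again assuming GNU grep for the \| alternation.

```shell
#!/bin/sh
# Sample tree with logs in two directories (names and contents are made up).
dir=$(mktemp -d)
mkdir "$dir/a" "$dir/b"
echo 'step failed: BA21007' > "$dir/a/load.err"
echo 'nothing to report'    > "$dir/b/clean.err"

# find passes the *.err files to grep in batches; -H keeps the file name
# in the output even if a batch happens to contain a single file.
out=$(find "$dir" -name '*.err' -exec grep -How '\(BA10\|BA20\|BA21\|BA30\|BA31\|BA00\)[[:alnum:]_]*' {} +)
printf '%s\n' "$out"

rm -rf "$dir"
```

Only the file under a/ contains a matching word, so a single line of the form path/a/load.err:BA21007 is expected.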

grep : to look up a valid entry in a log

Hello everyone, I'm having an issue with this script. I've just begun work on it, and it is supposed to look for entries previously generated by another script I made.
The gist is that the log has entries like:
makefile_1786878:/home/user/project
The format is filename_inode:/originaldirectory/
This script is supposed to take a parameter and look for its exact match in the log:
if [ $# -eq 0 ]
then
echo "No filename has been provided. Please enter a filename to restore!"
exit 1
fi
echo You have entered $1
echo Looking for $1 in the list of items deleted by safe_rm...
restoredfile=$(grep ^$1 $HOME/.restore.info)
echo $restoredfile
The problem I'm having is that if the user enters "mak", "make", or "makefi" as the parameter, the script still matches this entry.
I want it to return only an exact match, and I don't know how to force grep to do that.
Try either one of these and see whether it works for you:
grep -w "makefile"
grep "\<makefile\>"
If that works, change your grep to whichever form you prefer, with "$1" (in double quotes) in place of makefile.
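One caveat: the underscore counts as a word character, so grep -w "makefile" will not match an entry such as makefile_1786878. An alternative is to anchor on both the start of the line and the colon that ends the key. The sketch below assumes the user passes the full filename_inode key; the helper name lookup_entry and the sample log are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the log line whose key (the text before the
# first ':') is $1. $2 is the log file, e.g. "$HOME/.restore.info".
lookup_entry() {
    # ^ anchors at the start of the line and the trailing ':' ends the key,
    # so a shorter prefix such as "mak" cannot match.
    # Note: regex metacharacters inside $1 (for example '.') are still
    # interpreted by grep.
    grep -- "^$1:" "$2"
}

log=$(mktemp)
printf '%s\n' 'makefile_1786878:/home/user/project' 'make_42:/home/user/other' > "$log"

lookup_entry 'makefile_1786878' "$log"   # prints the matching entry
lookup_entry 'mak' "$log"                # prints nothing
rm -f "$log"
```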

Can I use grep to show only the matched line, and not the file it appeared in?

I sometimes want to grep for a function to see examples of how it is used in context, e.g. what sort of parameters it is called with. When I am doing this, the name of the file the match appears in becomes useless clutter. Is there any way to instruct grep not to include it? (Or a grep alternative that solves the same problem?)
You can tell grep not to indicate the filename in the output with the option -h:
-h, --no-filename
Suppress the prefixing of file names on output. This is the
default when there is only one file (or only standard input) to
search.
Test
$ echo "hello" > f1
$ echo "hello man" > f2
$ grep "hello" f*
f1:hello
f2:hello man
$ grep -h "hello" f*
hello
hello man

Using While and Grep together in Bourne Shell Script

I'm building a student database in Bourne Shell Script, and this is literally the very first time I've ever even seen code written like this, so I'm terribly out of my element. I need to make it so that when the user inputs a course, the program checks the user input against a database of courses I already have, and if the course doesn't exist, prompts the user to input a new course. This is what I'm trying:
echo "course-1: \c"
read course1
while [[ grep -i "$course1" course3.dat == 1]]
do
echo "course does not exist"
echo "course-1: \c"
read course1
done
echo "course-2: \c"
read course2
while [[ grep -i "$course2" course3.dat == 1]]
do
echo "course does not exist"
echo "course-2: \c"
read course2
done
But I'm getting errors "conditional binary operator expected" and "syntax error near `-i' ". I've been trying to google answers but I'm not coming up with anything useful. So I was wondering if any of you could help me? Thanks so much.
[[ is a conditional-expression construct, similar to test (/bin/test). It evaluates an expression rather than running a command, which isn't what you want here.
Try this instead:
while ! grep -i "$course1" course3.dat
Or
until grep -i "$course1" course3.dat
The grep command evaluates to true when grep is successful (i.e. it found matching lines), and the ! inverts that. until has the opposite semantics from while built in: it keeps looping as long as the command fails.
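Put together, the loop could look like the sketch below. It assumes one course name per line in the course file, which the question does not confirm, and the helper name read_course is hypothetical. Prompts go to stderr so that only the accepted course is printed on stdout.

```shell
#!/bin/sh
# Hypothetical helper: keep reading course names from stdin until one is
# found in the course file given as $1, then print the accepted name.
read_course() {
    printf 'course: ' >&2
    read -r course || return 1
    # -q: no output, only the exit status; -i: ignore case;
    # -x: whole-line match; -F: treat the input as a plain string.
    until grep -qixF -- "$course" "$1"; do
        echo "course does not exist" >&2
        printf 'course: ' >&2
        read -r course || return 1   # give up at end of input
    done
    printf '%s\n' "$course"
}

courses=$(mktemp)
printf '%s\n' 'MATH101' 'CS200' > "$courses"
# Simulated keyboard input: BIO1 is rejected, cs200 is accepted.
printf '%s\n' 'BIO1' 'cs200' | read_course "$courses"
rm -f "$courses"
```

In an interactive script you would call read_course course3.dat without the pipe, so that read takes its input from the keyboard.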
[[ and [ are "test", which is what you want.
However, different shells have different syntaxes; ksh or bash would interpret "[[" okay, but Bourne shell (normally /bin/sh) would not.

How can I remove duplicates (deduplicate) a mbox format email mailbox?

I've got a mbox mailbox containing duplicate copies of messages, which differ only in their "X-Evolution:" header.
I want to remove the duplicate ones, in as quick and simple a way as possible. It seems like this would have been written already, but I haven't found it, although I've looked at the Python mailbox module, the various perl mbox parsers, formail, and so forth.
Does anyone have any suggestions?
This is a small script that I used for it:
#!/bin/bash
IDCACHE=$(mktemp -p /tmp)
formail -D $((1024*1024*10)) "${IDCACHE}" -s
rm "${IDCACHE}"
The mailbox needs to be piped through the script; the deduplicated mailbox comes out on standard output.
-D $((1024*1024*10)) sets a 10 MiB cache, which is more than 10x the amount needed to deduplicate an entire year of my mail. YMMV, so adjust it accordingly. Setting it too high will cause some performance loss; setting it too low will let duplicates slip through.
formail is part of the procmail utility bundle, mktemp is part of coreutils.
I didn't look at formail (part of procmail) in enough detail. It does have such an option, as mentioned in places like: http://hints.macworld.com/comment.php?mode=view&cid=115683 and http://us.generation-nt.com/answer/deleting-duplicate-mail-messages-help-172481881.html
'formail -D' and 'reformail -D' can only process one email per execution, so each mail needs to be split out of the mbox before it is processed. I use reformail from maildrop instead, since it's still in active development.
Remove any old idcache, tmpmail, and nmbox files.
Run dedup.sh with the mbox file as its argument.
nmbox is the output, with duplicate messages removed.
dedup.sh
#! /bin/sh
# $1 = mbox file (e.g. a Thunderbird mailbox)
# wmbox.sh is called once for each mail.
reformail -s ./wmbox.sh < "$1"
wmbox.sh
#! /bin/sh
# stdin: a single email
# called by dedup.sh
TM=tmpmail
# refuse to overwrite a temporary file left over from an earlier run
if [ -f "$TM" ] ; then
    echo "error: $TM already exists" >&2
    exit 1
fi
cat > "$TM"
# mbox format: each mail ends with a blank line
echo "" >> "$TM"
# reformail -D exits non-zero when the Message-ID is not yet in idcache,
# i.e. when this mail is not a duplicate
if ! reformail -D 99999999 idcache < "$TM"; then
    # only keep mails that actually have a Message-ID header
    if grep -q -i '^message-id:' "$TM"; then
        cat "$TM" >> nmbox
    fi
fi
rm "$TM"
