Difference in behavior of "tr" command on "dash" (-) between busybox and Ubuntu/Raspbian/etc

I have a function in a script which is used to validate that input strings don't contain any unacceptable characters. In this case, allowable characters are alpha, numeric, underscore, dash, period, and space.
#!/bin/sh
pattern="\_\-\. [a-zA-Z0-9]"
while [ 1 ]; do
echo "enter your test string"
read string
echo "result:"
echo "$string" | tr -cd "$pattern" | sed 's/\[//' | sed 's/\]//'
echo
echo
done
Testing on Raspbian (Raspberry Pi):
pi@raspberrypi:~ $ ./trtest.sh
enter your test string
dash-dash
result:
dash-dash
enter your test string
under_score
result:
under_score
Testing on an Onion board (OpenWRT/busybox):
root@Omega-FD22:~# ./trtest.sh
enter your test string
dash-dash
result:
dashdash <<<----- I'm not expecting this
enter your test string
under_score
result:
under_score
So,
#1 I am not sure why there is a difference in behavior between "tr" in these two cases, specifically on the "dash" character.
#2 If there's another way to do this, I'm open to it.
Thanks for any insights.
DL

FYI, one of my diligent colleagues figured it out, so I am passing on his solution. If you move "\-" to the end of the pattern string, it works in both environments. I can't explain the technical/philosophical underpinnings of this, but I'm glad it works.
Before:
pattern="\_\-\. [a-zA-Z0-9]"
After:
pattern="\_\. [a-zA-Z0-9]\-"
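One likely explanation, offered as a sketch rather than a confirmed diagnosis: in a tr set, a dash between two characters denotes a range, and implementations differ in whether "\-" counts as an escaped literal dash. A dash placed last in the set cannot start a range, so it is taken literally. The function name sanitize is made up for illustration. The brackets are dropped on the assumption that they were meant as a regex-style character class, which tr does not use (it would treat them as literal characters).

```shell
#!/bin/sh
# sanitize (hypothetical name): keep only letters, digits, underscore,
# period, space and dash; delete every other character.
# The dash comes last in the set so it cannot be read as a range operator.
sanitize() {
    printf '%s' "$1" | tr -cd '_. a-zA-Z0-9-'
    # tr -cd also deletes newlines, so terminate the line explicitly.
    echo
}

sanitize 'dash-dash'
sanitize 'under_score'
```

With the brackets gone from the set, the two sed calls that strip "[" and "]" from the output are no longer needed.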

Related

Linux search string in a file with space

I'm trying to figure out how to search for a string in Linux; I hope someone can help me out.
grep "Test\|Account" test.txt
The above command works if I only want to search for one word.
But when I try to search for "Create Test 'account'", I'm not sure how to use grep, since I'm a newbie in Linux.
With a GNU grep, you can use
grep "Create Test 'account'\|Create Test \`account\`" test.txt
Here, the backticks are escaped because, inside a double-quoted string, the shell would otherwise treat them as command substitution. The | alternation operator is escaped because, in a basic regular expression, an unescaped | is a literal pipe character.
Details:
Create Test 'account' - a literal text
\| - or
Create Test `account` - a literal text
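An alternative sketch that avoids the escaped alternation altogether: grep accepts several -e options and prints lines matching any of them. Putting the backtick variant in single quotes also removes the need to escape the backticks. The function name find_account_lines is hypothetical, and it reads from standard input rather than from test.txt.

```shell
#!/bin/sh
# find_account_lines (hypothetical name): read text on stdin and print the
# lines that contain either quoting style of the phrase.
find_account_lines() {
    grep -e "Create Test 'account'" -e 'Create Test `account`'
}

# Usage with inline sample data instead of a file:
printf '%s\n' "Create Test 'account'" 'Create Test account' | find_account_lines
```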

How to use sed between single quotes?

I have a short script that looks like this. It's a way to execute bash inside a Groovy command:
sh (script: 'printf "${INFO} | sed 's/^[^\/]*://g'"',returnStdout: true).trim()
The value of INFO is test/word/fine.
With the script above I want to 'delete' everything till (and including) the first /. I can not make it work with the single quotes between single quotes. If that works I can check if my \/ will work.
Apparently Groovy allows you to use triple quotes so you don't have to force the command to be in single single quotes (sic).
sh """printf "${INFO}" | sed 's/^[^\/]*//'"""
Notice also the placement of the double quotes in the printf command. A better still solution would be to say printf '%s' "${INFO}" but ... do you really need the shell to interpolate the value of the variable INFO, and if so, why are you not simply doing sh 'echo "${INFO#*/}"'?
If indeed you only want the first occurrence to be replaced, the /g flag is superfluous, so I took it out. Your regex is anchored to the beginning of the string so it will only ever find a single match to replace, but saying "replace all occurrences on a line" when apparently that's precisely not what you want is misleading and confusing at best.
If indeed your test data doesn't contain a colon, the colon in your regex was wrong, so I took that out, too.
Commonly, we use a different separator like s%^[^/]*/%% so we don't have to backslash-escape slashes in our sed substitutions.
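A small plain-shell sketch of the alternate-separator form, outside of Groovy. The function name strip_first_component is invented for illustration; the sample value is the one from the question.

```shell
#!/bin/sh
# strip_first_component (hypothetical name): remove everything up to and
# including the first slash. Using % as the s-command delimiter means the
# slashes in the pattern need no backslashes.
strip_first_component() {
    printf '%s\n' "$1" | sed 's%^[^/]*/%%'
}

strip_first_component 'test/word/fine'   # prints: word/fine
```

An input with no slash at all passes through unchanged, because the pattern then has nothing to match.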
Solution 1: The following simple sed command may help.
echo "test/word/fine" | sed 's/\([^/]*\)\/\(.*\)/\2/'
Solution 2: There is no need for sed; use bash parameter expansion:
var="test/word/fine"
echo "${var#*/}"
word/fine

Grep regular expression - Pattern issue

I'm trying to use grep to find things matching a given pattern. For instance, I have these lines:
A secret word: CoolKapplan
A secret word: Kapplan
A secret word: Bungyjump
Suppose I know the first and last letters of a word; in this example, 'K' and 'n'.
PATTERN=K.....n
I do this: grep -w -r -H --color=always "^$PATTERN" *
I expect it to give me only the lines containing words that start with K. But that command also includes the first line, so the result is:
A secret word: CoolKapplan
A secret word: Kapplan
How do I make it search for the pattern without matching where the pattern is part of a longer word?
After some more trial and error, I found that you have to add the '-o' flag for it to work.
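For comparison, here is a sketch that relies only on -w, with no ^ anchor and no -o. It assumes a grep that supports -w (an extension beyond POSIX, though common in GNU, BSD and BusyBox builds). The function name whole_word_lines is hypothetical.

```shell
#!/bin/sh
# whole_word_lines (hypothetical name): print the stdin lines in which the
# pattern given as $1 matches a complete word, not part of a longer word.
whole_word_lines() {
    grep -w -- "$1"
}

# Sample data taken from the question.
printf '%s\n' \
    'A secret word: CoolKapplan' \
    'A secret word: Kapplan' \
    'A secret word: Bungyjump' | whole_word_lines 'K.....n'
```

Unlike -o, this keeps the whole line in the output and only changes which lines are selected.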

Grep words with exact two vowels

I have the following issue: I need to retrieve all words that contain exactly 2 vowels (in any order) from a file. The file contains only one word per line.
My current workaround is:
Grep1: Retrieve words such as earth, over, under, one...
grep -i "^[aeiou][^aeiou]*[aeiou][^aeiou]*$" genesis.words > A.txt
and
Grep2: Retrieve words such as formless, deep, said...
grep -i "^[^aeiou][^aeiou]*[aeiou][^aeiou]*[aeiou][^aeiou]*$" genesis.words > B.txt
The above solution works, but when I concatenate both regexes into a single regex, it returns nothing!
Mother of Grep1 & Grep2: should retrieve everything!
grep -i "^[aeiou][^aeiou]*[aeiou][^aeiou]*$|^[^aeiou][^aeiou]*[aeiou][^aeiou]*[aeiou][^aeiou]*$" genesis.words
I think the issue is around my use of ^ and $ in the expression, but I have tried different versions with no success!
Any help will be highly appreciated!
OS is AIX 6100-09-04-1441
You were close. This should work:
grep -i "^[^aeiou]*[aeiou][^aeiou]*[aeiou][^aeiou]*$" genesis.words > A.txt
So it should find all eight possibilities (two vowels delimit three non-vowel sequences, each possibly empty; 2^3 is 8):
[ ]I[ ]o[ ]
[ ]e[ ]a[r]
[ ]e[r]a[ ]
[ ]e[l]a[n]
[T]e[ ]a[ ]
[D]e[ ]a[r]
[D]e[w]a[r]
[D]a[w]a[ ]
[H]a[w]a[y]
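As for concatenation, | needs escaping, and so do the grouping parentheses in a basic regular expression. You can use a single anchoring:
^\(regexp1\|regexp2\)$
(With grep -E, write ^(regexp1|regexp2)$ instead.)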
Since the * can match 0 times or more you should be able to start the string with [^aeiou]*: try
"^[^aeiou]*[aeiou][^aeiou]*[aeiou][^aeiou]*$"
As for fixing your regex, I think you need to escape the bar as \|, so
grep -i "^[aeiou][^aeiou]*[aeiou][^aeiou]*$\|^[^aeiou][^aeiou]*[aeiou][^aeiou]*[aeiou][^aeiou]*$" genesis.words
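If escaping the bar is awkward, or the local grep does not accept \| in basic regular expressions, a sketch with extended regular expressions may be easier to read: grep -E treats | and parentheses as operators without backslashes, so the two original patterns can share one pair of anchors. The function name two_vowel_words is made up for illustration, and it reads words from standard input rather than from genesis.words.

```shell
#!/bin/sh
# two_vowel_words (hypothetical name): print the stdin lines (one word per
# line) that contain exactly two vowels, using a single anchored ERE
# alternation: vowel-first words, or words that start with non-vowels.
two_vowel_words() {
    grep -iE '^([aeiou][^aeiou]*[aeiou][^aeiou]*|[^aeiou]+[aeiou][^aeiou]*[aeiou][^aeiou]*)$'
}

printf '%s\n' earth god deep beginning one | two_vowel_words
```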
If you don't mind Perl, you could use this:
perl -lne '$m=$_; tr/[aeiou]//cd; print $m if length()==2;' /usr/share/dict/words
That says... "save the current line (word) in $m. Delete everything that is not a vowel. Print the original word if there are two things (i.e., vowels) left."
Note that I am using the system dictionary as input for my tests.
You could do pretty much the same thing in awk.
If you're able to use an alternative to grep, tr with wc works well:
words=/path/to/words.txt
while read -r word; do
  # delete everything except vowels, then count what is left
  v=$(echo "$word" | tr -cd 'aeiou' | wc -c)
  [ "$v" -ne 2 ] || echo "$word" >> output.txt
done < "$words"
This reads the original file line by line, counts the vowels, and appends the words with exactly 2 vowels to output.txt.

duplicate grep output when comparing two files

I have literally been at this for 5 hours. I have busybox on my device, and unfortunately I do not have -X in grep to make my life easier.
Edit:
I have two lists, both containing MAC addresses. Essentially, I just want offline MAC address lookup so I don't have to keep looking them up online.
list.txt has vendor MAC prefixes. Of course this isn't the complete list, just an example:
00:13:46
00:15:E9
00:17:9A
00:19:5B
00:1B:11
00:1C:F0
scan will have a list of different full-length MAC addresses whose vendors are unknown. Whenever there is a match, I want the line in scan to be output.
It pretty much does that, but it outputs everything from the scan file and then outputs the matching lines at the end, causing duplicates. I tried sort -u, but it has no effect. It is as if there are two separate outputs from two different methods: the whole scan file is printed instantly, and a couple of seconds later the matching lines are printed.
From searching, I came across this:
#!/bin/bash
while read line; do
grep -F 'list' 'scan'
done < list.txt
which displays the duplicate result when/if found. The output is pretty much my scan file echoed, followed by the matched pattern, thus creating duplicates.
It is frustrating that I have not found a solution after clicking on all the links in Google up to page 9.
Please, someone help me.
I don't know if the Busybox sed supports this out of the box, but if not, it should be easy to do in Awk or Perl instead.
Create a sed script to print lines from file2 which are covered by a prefix in file1 by transforming each line in file1 into a sed command to print a match for that regular expression:
sed 's%.*%/&/p%' file1 | sed -n -f - file2
The same in Awk:
awk 'NR==FNR { a[++i]="^" $0; next }
{ for (j=1; j<=i; ++j) if ($0 ~ a[j]) print }' file1 file2
OK guys, I did a nested for loop (probably very inefficient), but I got it working, printing the matching MAC addresses, using this:
#!/usr/bin/bash
# Compare the first three octets of each scanned MAC against every vendor prefix.
for scanlist in $(cut -d: -f1,2,3 scan)
do
  for listt in $(cat list)
  do
    if [[ $scanlist == "$listt" ]]; then
      grep "$scanlist" scan
    fi
  done
done
If anyone can make this more elegant, please do, but it works for me for now. I think the problem was that one list contained just 00:11:22 while the other contained 00:11:22:33:44:55, which is why I cut the scan entries to the same length as the entries in the other list. So this outputs only the matches instead of producing duplicate output.
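One possibly more elegant variant, sketched under assumptions: each line of scan starts with a colon-separated MAC address, the letter case matches the prefix list, and the prefix list is not empty. It loads the prefixes into an awk array and tests each scan line's first three octets with a single lookup, so the inner loop disappears. The function name match_vendor_prefixes is hypothetical.

```shell
#!/bin/sh
# match_vendor_prefixes (hypothetical name): $1 is the vendor prefix list,
# $2 is the scan file. Prints each scan line whose first three octets
# appear in the prefix list.
match_vendor_prefixes() {
    awk -F: 'NR == FNR { vendor[$0]; next }      # first file: remember prefixes
             ($1 ":" $2 ":" $3) in vendor        # second file: print on a hit
            ' "$1" "$2"
}

# Usage, assuming the file names from the post:
# match_vendor_prefixes list scan
```

Each scan line is printed at most once, which avoids the duplicate output described in the question.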
