I'm looking for some help to understand why this:
palabra=s_gonzalez
i=10
awk -vvar1=$palabra -vvvar2=$i '( $1 == var1 ) && ( $2 == var2 ) {print $0}' As
is not printing anything. The As file contains:
r_castillo 10
flores 6
s_gonzalez 10
o_gutzwiller 12
h_ji 4
Thanks in advance for any suggestion.
Look at the variable name your second -v option actually sets:
vvar2
Did you misspell var2?
As a technique to avoid this sort of problem, you can assign variables without -v. I would rewrite the command:
awk '$1==var1 && $2==var2' var1=$palabra var2=$i As
It always seems simpler to me to assign variables as arguments after the program rather than as -v options before the program. (-v assignments are available in the BEGIN block, but that is irrelevant in this case.)
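A small sketch of the difference between the two styles, using a throwaway copy of the As data. The temporary file and the result variable names are only there for illustration:

```shell
# Recreate the sample data in a temporary file (illustrative only).
As=$(mktemp)
printf '%s\n' 'r_castillo 10' 'flores 6' 's_gonzalez 10' 'o_gutzwiller 12' 'h_ji 4' > "$As"

palabra=s_gonzalez
i=10

# Assignments placed after the program are processed before the file
# that follows them is read, so the main rule sees both variables.
result=$(awk '$1 == var1 && $2 == var2' var1="$palabra" var2="$i" "$As")
echo "$result"

# Operand assignments are not yet visible in BEGIN; -v assignments are.
begin_arg=$(awk 'BEGIN { print "[" var1 "]" }' var1="$palabra" </dev/null)
begin_v=$(awk -v var1="$palabra" 'BEGIN { print "[" var1 "]" }')
echo "$begin_arg $begin_v"

rm -f "$As"
```

On the sample data the first command should print the s_gonzalez line; the second pair shows an empty value for the operand form and the real value for the -v form.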
I have the following issue: I need to retrieve all words that contain exactly 2 vowels (in any order) from a file. The file only contains one word per line.
My current workaround is:
Grep1: Retrieve words such as earth, over, under, one...
grep -i "^[aeiou][^aeiou]*[aeiou][^aeiou]*$" genesis.words > A.txt
and
Grep2: Retrieve words such as formless, deep, said...
grep -i "^[^aeiou][^aeiou]*[aeiou][^aeiou]*[aeiou][^aeiou]*$" genesis.words > B.txt
The above solution works, but when I concatenate both regexes into a single regex it returns nothing!
Mother of Grep1 & Grep2: should retrieve everything!
grep -i "^[aeiou][^aeiou]*[aeiou][^aeiou]*$|^[^aeiou][^aeiou]*[aeiou][^aeiou]*[aeiou][^aeiou]*$" genesis.words
I think the issue is around my use of ^ and $ in the expression, but I have tried different versions with no success!
Any help will be highly appreciated!
OS is AIX 6100-09-04-1441
You were close. This should work:
grep -i "^[^aeiou]*[aeiou][^aeiou]*[aeiou][^aeiou]*$" genesis.words > A.txt
So it should find all eight possibilities (two vowels delimit three nonvowel sequences, each possibly empty; 2^3 is 8):
[ ]I[ ]o[ ]
[ ]e[ ]a[r]
[ ]e[r]a[ ]
[ ]e[l]a[n]
[T]e[ ]a[ ]
[D]e[ ]a[r]
[D]e[w]a[r]
[D]a[w]a[ ]
[H]a[w]a[y]
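As for concatenation, in a basic regular expression | needs escaping, and so do the grouping parentheses. Note that \| is a GNU extension, so it may not work on AIX; grep -E with a plain | and ( ) is the safer choice there. You can use a single anchoring:
^\(regexp1\|regexp2\)$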
Since the * can match 0 times or more you should be able to start the string with [^aeiou]*: try
"^[^aeiou]*[aeiou][^aeiou]*[aeiou][^aeiou]*$"
As for fixing your regex, I think you need to escape the bar as \|, so
grep -i "^[aeiou][^aeiou]*[aeiou][^aeiou]*$\|^[^aeiou][^aeiou]*[aeiou][^aeiou]*[aeiou][^aeiou]*$" genesis.words
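For comparison, a sketch of the same alternation written for grep -E, where | and ( ) need no backslashes. The word list here is a small made-up sample, not the real genesis.words:

```shell
# Small illustrative word list (not the real genesis.words).
words=$(printf '%s\n' earth over formless deep said sky beginning a)

# With -E, | is alternation and ( ) is grouping, so one pair of
# anchors can wrap both alternatives.
two_vowels=$(printf '%s\n' "$words" |
  grep -E -i '^([aeiou][^aeiou]*[aeiou][^aeiou]*|[^aeiou][^aeiou]*[aeiou][^aeiou]*[aeiou][^aeiou]*)$')
echo "$two_vowels"
```

This keeps your two original patterns intact, which can be handy for checking that the combined expression matches exactly what Grep1 and Grep2 matched separately.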
If you don't mind Perl, you could use this:
perl -lne '$m=$_; tr/aeiouAEIOU//cd; print $m if length()==2;' /usr/share/dict/words
That says: "Save the current line (word) in $m. Delete everything that is not a vowel. Print the original word if there are two things (i.e. vowels) left."
Note that I am using the system dictionary as input for my tests.
You could do pretty much the same thing in awk.
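A possible awk version of the same idea, as a sketch. It relies on gsub() returning the number of replacements it made; the input words are made up for the example:

```shell
# Illustrative input; replace with your own word file.
words=$(printf '%s\n' earth sky Over beginning deep)

# gsub() returns the number of substitutions, i.e. the vowel count.
# It works on a copy (w) so the original line is printed untouched.
awk_result=$(printf '%s\n' "$words" |
  awk '{ w = $0; if (gsub(/[aeiouAEIOU]/, "", w) == 2) print }')
echo "$awk_result"
```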
If you're able to use an alternative to grep, tr with wc works well:
words=/path/to/words.txt
while IFS= read -r word; do
# count the vowels in the word (add AEIOU to the set for mixed-case input)
v=$(printf '%s' "$word" | tr -cd 'aeiou' | wc -c)
[[ $v -eq 2 ]] && echo "$word" >> output.txt
done < "$words"
This reads the original file line by line, counts the vowels in each word, and appends the words with exactly 2 vowels to output.txt.
I want a command that can match all the below criteria in Red Hat:
·number range between 0100xxxx to 0110xxxxx
·And have money over 300
·Status either X or Z
·id contains letter ‘a’
·Error_code starting with 2
number,money,status,error-code,id
010018739,13213,X,300,abcde
010523456,343,Z,500,xcvfe
010743576,563,X,201,fgsa
012095654,300,X,400,gcaz
019432343,300,X,402,dewa
011023324,200,X,206,dea
020023433,100,X,303,a
010832134,300,X,200,a
012244242,433,Z,204,ghfsa
Something like this:
awk -F, '$1>=10000000 && $1<=11099999 && $2>300 && ($3=="X" || $3=="Z") && index($5,"a") && index($4,"2")==1' file
It doesn't cater for the status being lower-case (but you didn't ask for that), nor does it cater for there being spaces in front of the status or error code (but you didn't ask for that either).
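If you did need those two cases, a sketch along these lines might do. It assumes the numbers are the 9-digit values from your sample (so the range is 010000000 to 011099999), and the padded, lower-case rows below are invented for the example:

```shell
# Illustrative data with a padded, lower-case status and a padded error code.
data=$(printf '%s\n' \
  'number,money,status,error-code,id' \
  '010743576,563, x, 201,fgsa' \
  '010832134,300,X,200,a' \
  '012244242,433,Z,204,ghfsa')

filtered=$(printf '%s\n' "$data" | awk -F, '
  NR == 1 { next }                       # skip the header line
  {
    status = toupper($3); code = $4
    gsub(/[ \t]/, "", status)            # drop padding around the status
    gsub(/[ \t]/, "", code)              # drop padding around the error code
  }
  $1 >= 10000000 && $1 <= 11099999 && $2 > 300 &&
    (status == "X" || status == "Z") && index($5, "a") && code ~ /^2/
')
echo "$filtered"
```

Only the first data row should survive: the second has money equal to 300 rather than over it, and the third is outside the number range.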
grep only matches text, awk is much more flexible and should fit your case better. For instance:
awk 'BEGIN {FS=","} $2 > 300 {print;}' < yourfile
Basically this is saying that ',' is the field separator, and then for every line where the second field ($2) is > 300, the action (in this case just print the whole line, which could even be omitted IIRC) is executed.
You can have conditions as complex as you like, with a syntax that is similar to C. I would suggest reading man awk and googling for more complex examples, but you should get the idea.
I'm totally new to AWK, however I think this is the best way to solve my problem and a good time to learn AWK.
I am trying to read a large data file that is created by a simulation program. The output is made to be readable by humans, so its formatting isn't very consistent. An example of the output is in this image
http://i.imgur.com/0kf8l.png
I need a way to find a line like "He 2 4686A -2.088 0.0071", by specifying the "He 2 4686A" part and get the following two numbers. The problem is the line "He 2 4686A -2.088 0.0071" can appear anywhere in the table.
I know how to find the entry "He 2 4686A", but I don't know which of the 4 columns it's in. So I don't know how to address the values that follow it.
A command that lets me just read the next two words, or tells me the location of the pattern once a match is found will both help.
/He 2 4686A/ finds the line
Ca A 3970A -0.900 0.1100 He 2 4686A -2.088 0.0071 S 3 18.67m -0.371 0.3721 Ar 4 444.7A -2.124 0.0066
Any help is appreciated.
The first step should be to bring what seem to be 4 columns of records into a 1-column format. Then it's easy with awk, because you can match on the first three fields and print the other two, like:
echo "He 2 4686A -2.088 0.0071" | \
awk '$1 == "He" && $2 == 2 && $3 == "4686A" {print $4, $5}'
which gives
-2.088 0.0071
So, for me, the only challenge is to transform your data to one-column format. From the picture that looks simple, because the columns seem to have a fixed width which you can count.
Assuming that your column width is 30 characters (difficult to tell from a picture; beware of tabs) and your data is in input_file, you could first "cut" the data into 4 columns and then pipe the output to another awk process:
awk '{
print substr($0,1,30)
print substr($0,31,30)
print substr($0,61,30)
print substr($0,91,30)
}' input_file | \
awk '$1 == "He" && $2 == 2 && $3 == "4686A" {print $4, $5}'
If you really just need the next two numbers behind an anchor then I would say the grep-solution from Costa is best for you, however this gives you the possibility to implement further logic...
If you're not dead set on using awk, grep would be the easiest way...
grep -Eo "He 2 4686A -?[0-9.]+ -?[0-9.]+" output.txt
EDIT: The above would work only if the fields were separated by a single space, which doesn't seem to be your case. In order to handle tabs and/or repeated spaces (note that \t does not mean a tab inside a grep bracket expression, so [[:blank:]] is used instead):
grep -Eo "He[[:blank:]]+2[[:blank:]]+4686A[[:blank:]]+-?[0-9.]+[[:blank:]]+-?[0-9.]+" output.txt
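Since the question also asks for a way to locate the pattern within the line, here is a sketch that stays in awk and scans the fields instead of relying on column widths. It assumes the label is always the three whitespace-separated tokens He, 2 and 4686A, followed directly by the two numbers:

```shell
# One line of the table, as quoted in the question.
line='Ca A 3970A -0.900 0.1100 He 2 4686A -2.088 0.0071 S 3 18.67m -0.371 0.3721 Ar 4 444.7A -2.124 0.0066'

# Walk over the fields; when three consecutive fields equal the label,
# print the two fields that follow and stop scanning this line.
values=$(printf '%s\n' "$line" | awk '{
  for (i = 1; i + 4 <= NF; i++)
    if ($i == "He" && $(i+1) == "2" && $(i+2) == "4686A") {
      print $(i+3), $(i+4)
      break
    }
}')
echo "$values"
```

The loop index i is also the position of the match, so it can be printed or reused if further logic needs to know which column the entry was in.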
I need to find some matching conditions in a set of files and then recursively search for the next condition only in the previously matched files. I have something like this:
input.txt
123
22
33
The terms above need to be found in the following files. The challenge is that if 123 is found in, say, 10 files, then 22 should be searched for in those 10 files only, and so on.
The files are named like f1, f2, f3, f4, ... f1200,
so it is as if I need to run grep -w "123" f* | grep -w "123" | .....
It's not possible to list them manually, so is there an easier way?
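You can solve this with an awk script that generates the pipeline for you; I've encountered a similar problem and this approach works:
awk 'NR==1 { printf("grep -lw %s f*", $1); next } { printf(" | xargs grep -lw %s", $1) } END { print "" }' input.txt | sh
What does it do?
It reads input.txt line by line.
For the first record it prints grep -lw TERM f*, which lists the files that contain the term.
For every later record it appends | xargs grep -lw TERM (note the pipe), so each term is searched only in the files that matched the previous one.
The finished pipeline is then sent to the shell for execution, and the names of the files that matched every term are printed.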
Perhaps taking a meta-programming viewpoint would help. Have grep output a series of grep commands. Or write a little Perl program. Maybe Ruby, if the mood suits.
You can use grep -lw to write the list of file names that matched (note that with -l, grep stops reading each file after finding the first match).
You capture the list of file names and use that for the next iteration in a loop.
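A minimal sketch of that loop. The scratch directory and the three small files are made up for the example, and the loop assumes file names without whitespace (true for f1 ... f1200):

```shell
# Set up a few illustrative files in a scratch directory.
dir=$(mktemp -d)
printf '123\n22\n33\n' > "$dir/input.txt"
printf '123\n22\n33\n' > "$dir/f1"
printf '123\n22\n'     > "$dir/f2"
printf '22\n33\n'      > "$dir/f3"

cd "$dir" || exit 1
files=$(echo f*)                 # start with every candidate file
while IFS= read -r term; do
  [ -n "$files" ] || break       # nothing left to search
  # Keep only the files that contain the term as a whole word.
  # $files is left unquoted on purpose so it splits into file names.
  files=$(grep -lw -- "$term" $files)
done < input.txt
matches=$files
echo "$matches"

cd "$OLDPWD" && rm -rf "$dir"    # clean up the scratch directory
```

Each pass narrows the list, so later terms are only ever searched in files that matched all earlier ones.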
With a "normal" (I mean "full") Linux distro, this works just fine:
sleep $(echo "$[ ($RANDOM % 9 ) ]")
OK, it waits for 0 to 8 seconds.
But under OpenWRT (which uses "ash" rather than bash):
$ sleep $(echo "$[ ($RANDOM % 9 ) ]")
sleep: invalid number '$['
$
And here is why:
$ echo "$[ ($RANDOM % 9 ) ]"
$[ ( % 9 ) ]
$
So does anyone have a way to generate random numbers under OpenWRT, so I can pass one to sleep?
Thank you
You might try something like this:
sleep `head /dev/urandom | tr -dc "0123456789" | head -c1`
Which works on my WhiteRussian OpenWRT router.
I actually don't know if this will always return a number, but when it does, it will always return 0-9, and only 1 digit (you could make it go up to 99 if you made the second head -c2).
Good luck!
you could also use awk
sleep $(awk 'BEGIN{srand();print int(rand()*9)}')
For some scenarios, this might not yield a sufficient diversity of answers. Another approach is to use /dev/urandom directly (eg https://www.2uo.de/myths-about-urandom/):
echo $(hexdump -n 4 -e '"%u"' </dev/urandom)
When using awk, note that awk uses the time of day as the seed (https://linux.die.net/man/1/awk). This might be relevant for scenarios where the time of day is reset (eg no battery backed time of day clock), or synchronised across a fleet (eg group restart).
srand([expr])
Uses expr as a new seed for the random number generator. If no expr is provided, the time of day is used. The return value is the previous seed for the random number generator.
This is confirmed by looking at the source in busybox (https://github.com/mirror/busybox/blob/master/editors/awk.c):
seed = op1 ? (unsigned)L_d : (unsigned)time(NULL);
At least for some versions of Openwrt, it seems an explicit call to srand() is required to avoid obtaining the same answers repeatedly:
# awk 'BEGIN{print rand(), rand()}'
0 0.345001
# awk 'BEGIN{print rand(), rand()}'
0 0.345001
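One way to combine the two ideas is to take the seed from /dev/urandom instead of the clock, sketched below. It assumes od is available on the target (the hexdump command shown earlier would serve the same purpose), and the seed and delay variable names are just for illustration:

```shell
# Read 4 random bytes as one unsigned integer to use as the seed.
# Assumes od is available; tr keeps only the digits of its output.
seed=$(od -An -N4 -tu4 /dev/urandom | tr -dc '0-9')

# Seed explicitly, then draw an integer delay from 0 to 9.
delay=$(awk -v seed="$seed" 'BEGIN { srand(seed); print int(rand() * 10) }')
echo "$delay"
# sleep "$delay"
```

Because the seed no longer depends on the time of day, devices that boot with the same clock value should not all pick the same delay.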