searching multiple words from a .tr file - grep

I need to count the received packets from a .tr file. The problem is that only one kind of event matters for me, but some unwanted events are also being counted.
So I need a way to filter them out.
line1: r 0.500000000 1 RTR --- 0 cbr 210 [0 0 0 0] ------- [1:0 5:0 32 0] [0] 0 0
line2: r 0.501408175 3 RTR --- 0 AODV 48 [0 ffffffff 1 800] ------- [1:255 -1:255 30 0] [0x2 1 1 [5 0] [1 4]] (REQUEST)
I want only line 1, but as I am searching for '^r' only, both lines are returned. Please help me: how can I search for lines where 2 patterns are required?

You can use grep to match more than one expression at a time, or to return results if either of the expressions exists.
OR-like - match lines containing foo or bar:
grep -E "foo|bar" file
AND-like - match lines containing both foo and bar:
grep "foo" file | grep "bar"
As you need 2 patterns, I would go for the latter. Still, I think you could improve your question by adding an example of what you expect your code to return and the exact command you are currently using.
You may also define a grep command that finds the occurrences of foo but excludes lines containing bar. Maybe that is easier, depending on what you need.
grep "foo" file | grep -v "bar"


Lua: global var vs table entry var

In Lua, when you have a function in a table, what is the difference (if any) between declaring a global variable within the function and declaring the variable as an entry in the table? The variable is x in the example below.
i.e.
dog={x=33,
  func=function(self)
    self.x=self.x*self.x
  end
}
cat={func=function()
    x=33
    x=x*x
  end
}
In dog I can use the properties of self to call the function with dog:func() instead of dog.func(dog). But outside of that, is there anything performance-wise to take into consideration when choosing between the two? The examples work a bit differently when called in a loop, but outside of that?
Well, I heard that the first two rules of optimization are "Don't do it!" and "Don't do it yet!".
There is an official document describing some ways to optimize Lua code, and I recommend it. The most important rule is to prefer local variables to global variables, because global variables are about 30% slower than local ones.
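As a quick aside (a sketch, not part of the comparison below), that rule is why hot loops usually cache a global in a local first:
-- one lookup of the global math table here...
local sin = math.sin
local sum = 0
for i = 1, 1000000 do
  sum = sum + sin(i)  -- ...instead of a "math" plus "sin" lookup per iteration
end
print(sum)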
The first thing we can do with the previous code is to compile it and check the bytecode instructions to understand what happens at execution time. I stored the first function in "test-1.lua" and the second one in "test-2.lua".
> cat test-1.lua
dog={x=33,
  func=function(self)
    self.x=self.x*self.x
  end
}
function TEST ()
  dog:func()
end
> luac54 -l -s test-1.lua
#
#(part of output omitted for clarity)
#
# Function: dog.func
#
function <test-1.lua:2,4> (6 instructions at 0000000000768740)
1 param, 3 slots, 0 upvalues, 1 local, 1 constant, 0 functions
1 [3] GETFIELD 1 0 0 ; "x"
2 [3] GETFIELD 2 0 0 ; "x"
3 [3] MUL 1 1 2
4 [3] MMBIN 1 2 8 ; __mul
5 [3] SETFIELD 0 0 1 ; "x"
6 [4] RETURN0
#
# Function: TEST (function to call dog.func)
#
function <test-1.lua:7,9> (4 instructions at 00000000000a8a90)
0 params, 2 slots, 1 upvalue, 0 locals, 2 constants, 0 functions
1 [8] GETTABUP 0 0 0 ; _ENV "dog"
2 [8] SELF 0 0 1k ; "func"
3 [8] CALL 0 2 1 ; 1 in 0 out
4 [9] RETURN0
So, if we want to execute TEST 10 times, we will need to execute at least 10*(4+6) bytecode instructions, that is, 100 bytecode instructions.
> cat test-2.lua
cat={func=function()
    x=x*x
  end
}
x=33
function TEST ()
  cat.func()
end
> luac54 -l -s test-2.lua
#
#(part of output omitted for clarity)
#
# Function: cat.func
#
function <test-2.lua:1,3> (6 instructions at 00000000001b87f0)
0 params, 2 slots, 1 upvalue, 0 locals, 1 constant, 0 functions
1 [2] GETTABUP 0 0 0 ; _ENV "x"
2 [2] GETTABUP 1 0 0 ; _ENV "x"
3 [2] MUL 0 0 1
4 [2] MMBIN 0 1 8 ; __mul
5 [2] SETTABUP 0 0 0 ; _ENV "x"
6 [3] RETURN0
#
# Function: TEST (function to call cat.func)
#
function <test-2.lua:8,10> (4 instructions at 00000000001b8a80)
0 params, 2 slots, 1 upvalue, 0 locals, 2 constants, 0 functions
1 [9] GETTABUP 0 0 0 ; _ENV "cat"
2 [9] GETFIELD 0 0 1 ; "func"
3 [9] CALL 0 1 1 ; 0 in 0 out
4 [10] RETURN0
So, if we want to execute TEST 10 times, we will need to execute at least 10*(4+6) bytecode instructions, that is, 100 bytecode instructions... which is exactly the same as the first version!
Obviously, not all bytecode instructions take the same time to execute. Some instructions spend much more time in the C runtime than others; adding two integers might be much faster than allocating a new table and initializing some fields. At that point, we could try a dirty-and-pointless microbenchmark to give us an idea.
One might copy and paste this code into a Lua interpreter:
> cat dirty-and-pointless-benchmark.lua
dog={x=33,
  func=function(self)
    self.x=self.x*self.x
  end
}
cat={func=function()
    x=x*x
  end
}
x=33
function StartMeasure ()
  StartTime = os.clock()
end
function StopMeasure (TestName)
  local Duration = os.clock() - StartTime
  print(string.format("%s: %f sec", TestName, Duration))
end
function DoTest1 (Count)
  for Index = 1, Count do
    dog:func()
  end
end
function DoTest2 (Count)
  for Index = 1, Count do
    cat.func()
  end
end
COUNT = 5000000000
StartMeasure()
DoTest1(COUNT)
StopMeasure("VERSION_1")
StartMeasure()
DoTest2(COUNT)
StopMeasure("VERSION_2")
This code gives these results on my computer:
VERSION_1: 246.816000 sec
VERSION_2: 250.412000 sec
Obviously, the difference is probably negligible for most programs. We should always try to spend more time writing correct programs and less time doing micro-benchmarks.
The two code snippets do very different things. dog.func sets self.x to the square of its previous value. cat.func sets the global x to 1089. You can't really compare the performance of two things whose functionality is so different.
First of all, you should change
cat={func=function()
    x=33
    x=x*x
  end
}
to
x=33
cat={func=function()
    x=x*x
  end
}
Now we have the same operations.
If I run both functions 10000 times, I end up with cat.func() being a few percent slower than dog:func().
This is not surprising, as indexing locals is faster than indexing globals.
To speed up cat you could do something like this:
x=33
cat={func=function()
    local _x = x
    x = _x*_x
  end
}
The fastest solution is probably
dog={x=33,
  func=function(self)
    local x = self.x
    self.x = x*x
  end
}
and you could even gain more speed by making your tables and x local.
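For illustration, here is a sketch of that fully-localized dog variant (same squaring logic; the local declarations are the only change):
local dog = {
  x = 33,
  func = function(self)
    local x = self.x
    self.x = x * x
  end,
}
dog:func()
print(dog.x)  --> 1089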
Usually you don't win anything significant doing things like that.
Premature optimization is a big no-no, and you should ask yourself what problem you're actually trying to solve.
It also doesn't make sense to squeeze the last percent out of your code if you do not even know enough Lua to write a simple benchmark for your code... just a thought.

full outer join with awk

After reading:
Combine two files with unequal length on common column with multiple matches with linux command line
I wonder how you would do a full outer join.
(hopefully it's ok to start a new question with it)
file one
A 1
C 4
file two
A 2
B 5
file three
A 7
D 9
the result would be:
A 1 2 7
B N 5 N
C 4 N N
D N N 9
Is there an awk one-liner solution for this, like the one I saw for the left outer join?
With GNU awk for true multi-dimensional arrays, ARGIND, and sorted_in:
$ cat tst.awk
{ vals[$1][ARGIND] = $2 }
END {
    PROCINFO["sorted_in"] = "#ind_str_asc"
    for (key in vals) {
        printf "%s%s", key, OFS
        for (fileNr=1; fileNr<=ARGIND; fileNr++) {
            val = (fileNr in vals[key] ? vals[key][fileNr] : "N")
            printf "%s%s", val, (fileNr<ARGIND ? OFS : ORS)
        }
    }
}
$ awk -f tst.awk file1 file2 file3
A 1 2 7
B N 5 N
C 4 N N
D N N 9
It's possible to solve this using the POSIX-standard join command. Given that the three files proposed in the question are named 1.txt, 2.txt and 3.txt:
join -a1 -a2 -eN -o 0,1.2,2.2 1.txt 2.txt | join -a1 -a2 -eN -o 0,1.2,1.3,2.2 - 3.txt
Note that while this provides the required output for the question as stated, it's quite a bit less flexible than Ed Morton's awk-based solution. For one thing, join needs its input files to be sorted on the shared field(s). And for another, it will only work for exactly three input files.
However, it's possibly simpler to use join in once-off ad hoc cases!
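If the input files are not already sorted on the join field, the same pipeline can sort them on the fly with process substitution (a sketch; assumes bash or zsh):
join -a1 -a2 -eN -o 0,1.2,2.2 <(sort -k1,1 1.txt) <(sort -k1,1 2.txt) |
  join -a1 -a2 -eN -o 0,1.2,1.3,2.2 - <(sort -k1,1 3.txt)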

Creating numbered variable names using the foreach command

I have a list of variables for which I want to create a list of numbered variables. The intent is to use these with the reshape command to create a stacked data set. How do I keep them in order? For instance, with this code
local ct = 1
foreach x in q61 q77 q99 q121 q143 q165 q187 q209 q231 q253 q275 q297 q306 q315 q324 q333 q342 q351 q360 q369 q378 q387 q396 q405 q414 q423 {
    gen runs`ct' = `x'
    local ct = `ct' + 1
}
when I use the reshape command it generates an order as
runs1 runs10 runs11 ... runs2 runs22 ...
rather than the desired
runs01 runs02 runs03 ... runs26
Preserving the order is necessary in this analysis. I'm trying to add a leading zero to all ct values less than 10 when assigning variable names.
Generating a series of identifiers with leading zeros is a documented and solved problem: see e.g. here.
local j = 1
foreach v in q61 q77 q99 q121 q143 q165 q187 q209 q231 q253 q275 q297 q306 q315 q324 q333 q342 q351 q360 q369 q378 q387 q396 q405 q414 q423 {
    local J : di %02.0f `j'
    rename `v' runs`J'
    local ++j
}
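As a quick check of what that extended macro function produces (an illustrative interactive snippet, not part of the original answer), the %02.0f format pads single-digit counters with a leading zero:
. display %02.0f 7
07
. display %02.0f 12
12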
Note that I used rename rather than generate. If you are going to reshape the variables afterwards, the labour of copying the contents is unnecessary. Indeed the default float type for numeric variables used by generate could in some circumstances result in loss of precision.
I note that there may also be a solution with rename groups.
All that said, it's hard to follow your complaint about what reshape does (or does not) do. If you have a series of variables like runs*, the most obvious reshape is a reshape long, for example:
clear
set obs 1
gen id = _n
foreach v in q61 q77 q99 q121 q143 {
    gen `v' = 42
}
reshape long q, i(id) j(which)
list
     +-----------------+
     | id   which    q |
     |-----------------|
  1. |  1      61   42 |
  2. |  1      77   42 |
  3. |  1      99   42 |
  4. |  1     121   42 |
  5. |  1     143   42 |
     +-----------------+
works fine for me; the column order information is preserved and no use of rename was needed at all. If I want to map the suffixes to 1 up, I can just use egen, group().
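That last step is a one-liner (a sketch; the new variable name is arbitrary):
egen newwhich = group(which)
which maps the suffixes 61, 77, 99, 121, 143 to 1, 2, 3, 4, 5 in sorted order.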
So, that's hard to discuss without a reproducible example. See
https://stackoverflow.com/help/mcve for how to post good code examples.

Functional impact of declaring local variables via function parameters

In writing some one-off Lua code for an answer, I found myself code golfing to fit a function on a single line. While this code did not fit on one line...
foo=function(a,b) local c=bob; some_code_using_c; return c; end
...I realized that I could just make it fit by converting it to:
foo=function(a,b,c) c=bob; some_code_using_c; return c; end
Are there any performance or functional implications of using a function parameter to declare a function-local variable (assuming I know that a third argument will never be passed to the function) instead of using local? Do the two techniques ever behave differently?
Note: I included semicolons in the above for clarity of concept and to aid those who do not know Lua's handling of whitespace. I am aware that they are not necessary; if you follow the link above you will see that the actual code does not use them.
Edit: Based on @Oka's answer, I compared the bytecode generated by these two functions, in separate files:
function foo(a,b)
  local c
  return function() c=a+b+c end
end
function foo(a,b,c)
  -- this line intentionally blank
  return function() c=a+b+c end
end
Ignoring addresses, the byte code report is identical (except for the number of parameters listed for the function).
You can go ahead and look at the Lua bytecode generated by using luac -l -l -p my_file.lua, comparing instruction sets and register layouts.
On my machine:
function foo (a, b)
  local c = a * b
  return c + 2
end
function bar (a, b, c)
  c = a * b
  return c + 2
end
Produces:
function <f.lua:1,4> (4 instructions at 0x80048fe0)
2 params, 4 slots, 0 upvalues, 3 locals, 1 constant, 0 functions
1 [2] MUL 2 0 1
2 [3] ADD 3 2 -1 ; - 2
3 [3] RETURN 3 2
4 [4] RETURN 0 1
constants (1) for 0x80048fe0:
1 2
locals (3) for 0x80048fe0:
0 a 1 5
1 b 1 5
2 c 2 5
upvalues (0) for 0x80048fe0:
function <f.lua:6,9> (4 instructions at 0x800492b8)
3 params, 4 slots, 0 upvalues, 3 locals, 1 constant, 0 functions
1 [7] MUL 2 0 1
2 [8] ADD 3 2 -1 ; - 2
3 [8] RETURN 3 2
4 [9] RETURN 0 1
constants (1) for 0x800492b8:
1 2
locals (3) for 0x800492b8:
0 a 1 5
1 b 1 5
2 c 1 5
upvalues (0) for 0x800492b8:
Not very much difference, is there? If I'm not mistaken, there's just a slightly different declaration location specified for each c, and the difference in the params size, as one might expect.
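One place that params-size difference is observable at run time is the debug library (a sketch, assuming Lua 5.2 or later, where debug.getinfo with the "u" option reports nparams; the function names are illustrative):
local function with_local(a, b) local c end
local function with_param(a, b, c) end
print(debug.getinfo(with_local, "u").nparams) --> 2
print(debug.getinfo(with_param, "u").nparams) --> 3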

New lines in word definition using interpreter directives of Gforth

I am using the interpreter-directive control structures of Gforth (not ANS standard), as described in the manual section 5.13.4 Interpreter Directives. I basically want to use the loop words to create a dynamically sized word containing literals. For example, I came up with this definition:
: foo
[ 10 ] [FOR]
1
[NEXT]
;
Yet this produces an Address alignment exception after the [FOR] (yes, I know you should not use a for loop in Forth at all. This is just for an easy example).
In the end it turned out that you have to write loops as one-liners in order to ensure their correct execution. So doing
: foo [ 10 [FOR] ] 1 [ [NEXT] ] ;
instead works as intended. Running see foo yields:
: foo
1 1 1 1 1 1 1 1 1 1 1 ; ok
which is exactly what I want.
Is there a way to get new lines in the word definition? The words I would like to write are way more complex, and for a presentation I would need them better formatted.
It would really be best to use an immediate word instead. For example,
: ones ( n -- ) 0 ?do 1 postpone literal loop ; immediate
: foo ( -- ten ones ) [ 10 ] ones ;
With SEE FOO resulting in the same as your example. With POSTPONE, especially with Gforth's ]] .. [[ syntax, the repeated code can be as elaborate as you like.
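A quick usage sketch relying only on the ONES definition above (the word name BAR is illustrative):
: bar ( -- five ones ) [ 5 ] ones ;
see bar   \ shows a body of five literal 1s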
A multiline [FOR] would need to do four things:
Use REFILL to read in subsequent lines.
Save the read-in lines, because you'll need to evaluate them one by one to preserve line-expecting parsing behavior (such as from comments: \ ).
Stop reading in lines, and loop, when you match the terminating [NEXT].
Take care to leave >IN right after the [NEXT] so that interpretation can continue normally.
You might still run into issues with some code, like code checking SOURCE-ID.
For an example of using REFILL to parse across multiple lines, here's code from a recent posting on comp.lang.forth (CLF), by Gerry:
: line, ( u1 caddr2 u2 -- u3 )
   tuck here swap chars dup allot move +
;
: <text>  ( "text" -- caddr u )
   here 0
   begin
      refill
   while
      bl word count s" </text>" compare
   while
      0 >in ! source line, bl c, 1+
   repeat then
;
This collects everything between <text> and a </text> that's on its own line, as with a here document, while also adding spaces. To save the individual lines for [FOR] in an easy way, I'd recommend leaving 0 as a sentinel on the data stack and then dropping SAVE-MEM'd lines on top of it.
