Does Lua have bitwise operators like PHP? For example:
$v = 5;
for ($i = 0; $i < $v; $i++) {
    if ($v & $i) {
        echo $i." ";
    }
}
Echo result:
1 3 4
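If so, how do I use them?
Since version 5.2, Lua comes with the bit32 library; bit32.band is the counterpart of the & operator in PHP. LuaJIT also has bit operations (the bit library). Since Lua 5.3 the language also has native bitwise operators (&, |, ~, <<, >>).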
Edit
Well, they're not exactly equivalent, but serve the same purpose.
See Lua logical operators as described at http://www.lua.org/manual/5.1/manual.html#2.5.3.
Related
I'm simply wondering what the Lua 5.4 reference manual meant by
The following strings denote other tokens:
+ - * / % ^ #
& ~ | << >> //
== ~= <= >= < > =
( ) { } [ ] ::
; : , . .. ...
These tokens are the operators and delimiters of the language. Before this quote, the Lua manual talks about names and reserved keywords. Since these tokens don't contain letters, they are not listed among the reserved keywords, so the manual calls them "other tokens".
I understand Lua does not have PCRE. How can I convert this into Lua?
# Quote shell chars
$a =~ s/[\002-\011\013-\032\\\#\?\`\(\)\{\}\[\]\^\*\<\=\>\~\|\; \"\!\$\&\'\202-\377]/\\$&/go;
# quote newline as '\n'
$a =~ s/[\n]/'\n'/go;
Is there a general converter that can convert any PCRE into Lua?
You may use
local a = "\002\003\004\005\006\007\008\009\010\011\012\\\n"
res, _ = a:gsub("([\002-\009\011-\026\\#?`(){}%[%]^*<>=~|; \"!$&'\130-\255])", "\\%1")
res, _ = res:gsub("\n", "'\n'")
print(res)
See Lua code demo
Note that in Lua patterns, \ is not a special char, % is used to escape special chars (like [), and \ddd escapes in Lua strings refer to decimal, not octal, codes.
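The octal-to-decimal step is easy to get wrong by hand, so here is a small sketch of the arithmetic. It is written in Python purely for illustration; the helper name octal_escape_to_lua is made up for this example and is not part of any library.

```python
def octal_escape_to_lua(octal_digits: str) -> str:
    """Turn the digits of a Perl octal escape (e.g. "011") into a Lua \\ddd decimal escape."""
    return "\\{:03d}".format(int(octal_digits, 8))

# The three octal ranges used in the Perl character class above.
perl_ranges = [("002", "011"), ("013", "032"), ("202", "377")]

for low, high in perl_ranges:
    print("\\{}-\\{}  ->  {}-{}".format(
        low, high, octal_escape_to_lua(low), octal_escape_to_lua(high)))
# \002-\011  ->  \002-\009
# \013-\032  ->  \011-\026
# \202-\377  ->  \130-\255
```

The decimal ranges printed here are the ones that appear in the Lua pattern.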
I have a pipe delimited feed file which has several fields. Since I only need a few, I thought of using awk to capture them for my testing purposes. However, I noticed that printf changes the value if I use "%d". It works fine if I use "%s".
Feed File Sample:
[jaypal:~/Temp] cat temp
302610004125074|19769904399993903|30|15|2012-01-13 17:20:02.346000|2012-01-13 17:20:03.307000|E072AE4B|587244|316|13|GSM|1|SUCC|0|1|255|2|2|0|213|2|0|6|0|0|0|0|0|10|16473840051|30|302610|235|250|0|7|0|0|0|0|0|10|54320058002|906|722310|2|0||0|BELL MOBILITY CELLULAR, INC|BELL MOBILITY CELLULAR, INC|Bell Mobility|AMX ARGENTINA SA.|Claro aka CTI Movil|CAN|ARG|
I am interested in capturing the second column which is 19769904399993903.
Here are my tests:
[jaypal:~/Temp] awk -F"|" '{printf ("%d\n",$2)}' temp
19769904399993904 # Value is changed
However, the following two tests work fine:
[jaypal:~/Temp] awk -F"|" '{printf ("%s\n",$2)}' temp
19769904399993903 # Value remains same
[jaypal:~/Temp] awk -F"|" '{print $2}' temp
19769904399993903 # Value remains same
So is this a limitation of "%d", which cannot handle long integers? If that's the case, why would it add one to the number instead of, say, truncating it?
I have tried this with BSD and GNU versions of awk.
Version Info:
[jaypal:~/Temp] gawk --version
GNU Awk 4.0.0
Copyright (C) 1989, 1991-2011 Free Software Foundation.
[jaypal:~/Temp] awk --version
awk version 20070501
Starting with GNU awk 4.1 you can use --bignum or -M
$ awk 'BEGIN {print 19769904399993903}'
19769904399993904
$ awk --bignum 'BEGIN {print 19769904399993903}'
19769904399993903
§ Command-Line Options
I believe the underlying numeric format in this case is an IEEE double. So the changed value is a result of floating point precision errors. If it is actually necessary to treat the large values as numerics and to maintain accurate precision, it might be better to use something like Perl, Ruby, or Python which have the capabilities (maybe via extensions) to handle arbitrary-precision arithmetic.
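To see why the value goes up by one instead of being truncated, here is a rough Python sketch. It assumes Python's float is an IEEE 754 double (true on common platforms), and the helper names are invented for this illustration. Above 2^53 a double can only hold multiples of a power of two, and the conversion rounds to the nearest one.

```python
def double_spacing(n: int) -> int:
    """Gap between adjacent integers that a double can represent near positive n."""
    shift = n.bit_length() - 53  # a double carries 53 significant bits
    return 1 << shift if shift > 0 else 1

def nearest_double(n: int) -> int:
    """Integer value of the double closest to n."""
    return int(float(n))

n = 19769904399993903
print(double_spacing(n))   # 4: only multiples of 4 are representable here
print(nearest_double(n))   # 19769904399993904, the nearest multiple of 4
print(n)                   # 19769904399993903, Python ints stay exact
```

The neighbours of the input are ...900 and ...904; the latter is only 1 away, so round-to-nearest picks it, which looks like "adding one".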
UPDATE: Recent versions of GNU awk support arbitrary precision arithmetic. See the GNU awk manual for more info.
ORIGINAL POST CONTENT:
XMLgawk supports arbitrary precision arithmetic on floating-point numbers.
So, if installing xgawk is an option:
zsh-4.3.11[drado]% awk --version |head -1; xgawk --version | head -1
GNU Awk 4.0.0
Extensible GNU Awk 3.1.6 (build 20080101) with dynamic loading, and with statically-linked extensions
zsh-4.3.11[drado]% awk 'BEGIN {
x=665857
y=470832
print x^4 - 4 * y^4 - 4 * y^2
}'
11885568
zsh-4.3.11[drado]% xgawk -lmpfr 'BEGIN {
MPFR_PRECISION = 80
x=665857
y=470832
print mpfr_sub(mpfr_sub(mpfr_pow(x, 4), mpfr_mul(4, mpfr_pow(y, 4))), 4 * y^2)
}'
1.0000000000000000000000000
This was already partially answered by @Mark Wilkins and @Dennis Williamson, but I found out that the largest integer awk can handle without losing precision is 2^53, because numbers are stored as 64-bit doubles.
See, e.g., the gawk manual:
http://www.gnu.org/software/gawk/manual/gawk.html#Integer-Programming
(sorry if my answer is too old. Figured I'd still share for the next person before they spend too much time on this like I did)
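A quick way to check the 2^53 boundary, sketched in Python under the assumption that its float is a 64-bit IEEE 754 double like awk's numbers. The function name survives_double is made up for this example.

```python
LIMIT = 2 ** 53  # 9007199254740992

def survives_double(n: int) -> bool:
    """True if n comes back unchanged after a round trip through a double."""
    return int(float(n)) == n

print(survives_double(LIMIT))                # True
print(survives_double(LIMIT + 1))            # False: rounds back to 2**53
print(survives_double(19769904399993903))    # False: the value from the question
```

Every integer up to 2^53 round-trips; beyond that, only some do.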
You're running into awk's floating-point representation issues. I don't think you can find a work-around within the awk framework to perform arithmetic on huge numbers accurately.
The only possible (and crude) way I can think of is to break the huge number into smaller chunks, perform your math, and join them again, or better yet, use a scripting language such as Perl/PHP/TCL/bsh that is more powerful than awk.
Using nawk on Solaris 11, I convert the number to a string by concatenating an empty string to the end, and then use %15s as the format string:
printf("%15s\n", bignum "")
Another caveat about precision:
the errors pile up with extra operations:
echo 19769904399993903 | mawk2 '{ CONVFMT = "%.2000g";
OFMT = "%.20g";
} {
print;
print +$0;
print $0/1.0
print $0^1.0;
print exp(-log($0))^-1;
print exp(1*log($0))
print sqrt(exp(exp(log(20)-log(10))*log($0)))
print (exp(exp(log(6)-log(3))*log($0)))^2^-1
}'
19769904399993903
19769904399993904
19769904399993904
19769904399993904
19769904399993912
19769904399993908
19769904399993628 <<<-- -275
19769904399993768 <<<-- -135
The first few are only off by less than 10.
The last 2 equations have triple-digit deltas.
For any of the versions that require calling helper math functions, simply passing the -M bignum flag is insufficient. One must also set the PREC variable.
For this example, setting PREC=64 and OFMT="%.17g" should suffice.
Beware of setting OFMT too high, relative to PREC, otherwise you'll see oddities like this :
gawk -M -v PREC=256 -e '{ CONVFMT="%.2000g"; OFMT="%.80g";... } '
19769904399993903
19769904399993903.000000000000000000000000000000000000000000000000000000000003734
19769904399993903.000000000000000000000000000000000000000000000000000000000003734
19769904399993903.000000000000000000000000000000000000000000000000000000000003734
19769904399993903.000000000000000000000000000000000000000000000000000000000003734
This happens because 80 significant digits require a precision of at least 265.75 bits, so basically 266 bits. gawk is fast enough, though, that you can probably safely pre-set PREC=4096 or 8192 instead of having to worry about it every time.
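The 265.75 figure comes from multiplying the digit count by log2(10). A minimal Python sketch of that arithmetic follows; the function name bits_for_digits is an invention for this example, not a gawk facility.

```python
import math

def bits_for_digits(digits: int) -> int:
    """Smallest whole number of bits that can carry the given count of decimal digits."""
    return math.ceil(digits * math.log2(10))

print(80 * math.log2(10))    # roughly 265.75
print(bits_for_digits(80))   # 266
print(bits_for_digits(17))   # 57
```

So OFMT="%.80g" needs PREC of at least 266, while the 17 digits of "%.17g" fit comfortably in PREC=64.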
If I run grep -C 1 match over the following file:
a
b
match1
c
d
e
match2
f
match3
g
I get the following output:
b
match1
c
--
e
match2
f
match3
g
As you can see, since the context around the contiguous matches "match2" and "match3" overlap, they are merged. However, I would prefer to get one context description for each match, possibly duplicating lines from the input in the context reporting. In this case, what I would like is:
b
match1
c
--
e
match2
f
--
f
match3
g
What would be the best way to achieve this? I would prefer solutions which are general enough to be trivially adaptable to other grep options (different values for -A, -B, -C, or entirely different flags). Ideally, I was hoping that there was a clever way to do that just with grep....
I don't think it is possible to do that using plain grep.
The sed construct below works to some extent; now I only need to figure out how to add the "--" separator:
$ sed -n -e '/match/{x;1!p;g;$!N;p;D;}' -e h log
b
match1
c
e
match2
f
f
match3
g
I don't think this is possible using plain grep.
Have you ever used Python? In my opinion it's a perfect language for such tasks (this code snippet will work for both Python 2.7 and 3.x):
with open("your_file_name") as f:
    lines = [line.rstrip() for line in f]

first = True
for num, line in enumerate(lines):
    if "match" in line:
        # Separator goes between groups, never after the last one.
        if not first:
            print("--")
        first = False
        # One line of context before and after, when it exists.
        if num > 0:
            print(lines[num - 1])
        print(line)
        if num < len(lines) - 1:
            print(lines[num + 1])
This gives me:
b
match1
c
--
e
match2
f
--
f
match3
g
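Since the question asks for something adaptable to other -A/-B/-C values, the same idea can be sketched as a function with the context sizes as parameters. This is only an illustration: the name grep_context and its parameters are made up here, and it mimics just this one aspect of grep, not its other flags.

```python
import re
from typing import List

def grep_context(lines: List[str], pattern: str,
                 before: int = 1, after: int = 1, sep: str = "--") -> List[str]:
    """Return one context group per matching line, even when groups overlap."""
    regex = re.compile(pattern)
    out: List[str] = []
    for i, line in enumerate(lines):
        if regex.search(line):
            if out:                      # separator between groups only
                out.append(sep)
            out.extend(lines[max(0, i - before): i + after + 1])
    return out

sample = "a b match1 c d e match2 f match3 g".split()
print("\n".join(grep_context(sample, "match")))
```

With the defaults this prints the same eleven lines as above; passing before=0, after=2 would correspond roughly to -A 2.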
I'd suggest patching grep instead of working around it. In GNU grep 2.9 in src/main.c:
/* We print the SEP_STR_GROUP separator only if our output is
   discontiguous from the last output in the file. */
if ((out_before || out_after) && used && p != lastout && group_separator)
  {
    PR_SGR_START_IF(sep_color);
    fputs (group_separator, stdout);
    PR_SGR_END_IF(sep_color);
    fputc('\n', stdout);
  }
A simple additional flag would suffice here.
Edit: Well, d'oh, it is of course not THAT simple since grep would not reproduce the context, just add a few more separators. Due to the linearity of grep, the whole patch is probably not that easy. Nevertheless, if you have a good case for the patch, it could be worth it.
This does not appear possible with grep or GNU grep. However it is possible with standard POSIX tools and a good shell like bash as leverage to obtain the desired output.
Note: neither python nor perl should be necessary for the solution. Worst case, use awk or sed.
One solution I rapidly prototyped is something like this. It does involve the overhead of re-reading the file, so it depends on whether that overhead is acceptable. The giveaway is the original question's use of a fixed 1 line of context (-C 1), which allows simple use of head and tail:
$ OIFS="$IFS"; lines=`grep -n match greptext.txt | /bin/cut -f1 -d:`;
for l in $lines;
do IFS=""; match=`/bin/tail -n +$(($l-1)) greptext.txt | /bin/head -3`;
echo $match; echo "---";
done; IFS="$OIFS"
This might have some corner case associated with it, and this resets IFS when perhaps not necessary, though it is a hint for trying to use the power of POSIX shell & tools rather than a high level interpreter to get the desired output.
Opinion: All good operating systems have: grep, awk, sed, tr, cut, head, tail, more, less, vi as built-ins. On the best operating systems, these are in /bin.
I'm trying to write a config-file parser for use in a non-standard C environment. Specifically, I can't rely on the utilities provided by <stdio.h>.
I'm looking to use Flex, but I need to use my own input structures rather than <stdio.h>'s FILE pointers.
You can define your own input method by defining the YY_INPUT macro (replace getchar() below with a read from your own input structure):
%{
#define YY_INPUT(buf,result,max_size) \
{ \
int c = getchar(); \
result = (c == EOF) ? YY_NULL : (buf[0] = c, 1); \
}
%}
Ragel is a generic state machine compiler whose generated code you can use inside a C function. It has special support for building tokenizers.