Logical OR vs Logical AND: which should be more binding? - parsing

I'm writing a small parser, which will have an OR operator and an AND operator. When you see a series of ORs and ANDs, which do you expect will be more binding? Given the expression a & b | c, do you expect it to mean (a&b)|c or a&(b|c)? Can you give any reason to prefer one over the other?

Do what everyone else does; AND binds tighter than OR (see e.g. C Operator Precedence Table). This is the convention that everyone expects, so adopt the principle of least surprise.
This choice isn't arbitrary. It stems from the fact that AND and OR follow a similar relationship to multiply and add, respectively; see e.g. http://en.wikipedia.org/wiki/Boolean_logic#Other_notations.
Note also that users of your language should be heavily encouraged to use parentheses to make their intentions clear to readers of their code. But that's up to them!
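To make the convention concrete, here is a minimal recursive-descent sketch in which AND binds tighter than OR. The function name `parse`, the tuple-based tree, and the token rules are all assumptions made for illustration, not anything from the asker's parser. The lower-precedence operator is handled by the outer function, so each OR operand is a complete AND chain.

```python
import re

def parse(text):
    """Parse identifiers, '&', '|' and parentheses into nested tuples."""
    # Characters that match none of the alternatives are silently skipped.
    tokens = re.findall(r"[A-Za-z_]\w*|[&|()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        # Lowest precedence: or_expr := and_expr ('|' and_expr)*
        node = parse_and()
        while peek() == "|":
            advance()
            node = ("or", node, parse_and())
        return node

    def parse_and():
        # Higher precedence: and_expr := atom ('&' atom)*
        node = parse_atom()
        while peek() == "&":
            advance()
            node = ("and", node, parse_atom())
        return node

    def parse_atom():
        tok = advance()
        if tok == "(":
            node = parse_or()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok in ("&", "|", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    tree = parse_or()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse("a & b | c"))    # ('or', ('and', 'a', 'b'), 'c')
print(parse("a & (b | c)"))  # ('and', 'a', ('or', 'b', 'c'))
```

Swapping which function calls which would flip the precedence, which is why the choice has to be made deliberately.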

AND and OR in Boolean algebra are analogous to * and + in regular algebra, so it makes sense that AND binds harder than OR just like * binds harder than +:
A B A*B A&B A+B A|B
0 0 0 0 0 0
0 1 0 0 1 1
1 0 0 0 1 1
1 1 1 1 1(>0) 1
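The table can be checked mechanically. The sketch below (helper names are made up for this example) confirms that on 0/1 values AND matches multiplication and OR matches addition capped at 1, and that AND distributes over OR the way * distributes over +.

```python
from itertools import product

def table_matches():
    # A&B equals A*B, and A|B equals A+B with any result > 0 read as 1.
    return all(
        (a & b) == a * b and (a | b) == min(a + b, 1)
        for a, b in product((0, 1), repeat=2)
    )

def and_distributes_over_or():
    # Mirrors a*(b+c) == a*b + a*c in ordinary algebra.
    return all(
        (a & (b | c)) == ((a & b) | (a & c))
        for a, b, c in product((0, 1), repeat=3)
    )

print(table_matches(), and_distributes_over_or())

# Python follows the same convention: & binds tighter than |.
print(1 | 0 & 0)    # parsed as 1 | (0 & 0), which is 1
print((1 | 0) & 0)  # the other grouping gives 0
```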

If you consider it like you would discrete maths, I'd say PEMDAS leads you to say that the AND is more binding. That's not always the case though.
I recommend encouraging your users to use parentheses wherever there's ambiguity.

In most languages & takes precedence over |. Alternatively, you can require expressions to be fully parenthesized.


How to compare all possible group combinations with EMMEANS in SPSS?

Suppose you have a 2x2 design and you're testing differences between those 4 groups using ANOVA in SPSS.
This is a graph of your data:
After performing ANOVA, there are 6 possible pairwise comparisons between groups that we can perform. These are:
A - C
B - D
A - D
B - C
A - B
C - D
If I want to perform pairwise comparisons, I would usually use this script after the UNIANOVA command:
/EMMEANS=TABLES(Var1*Var2) COMPARE (Var1) ADJ(LSD)
/EMMEANS=TABLES(Var1*Var2) COMPARE (Var2) ADJ(LSD)
However, after running this script, the output only contains 4 of the 6 possible comparisons - there are two pairwise comparisons that are missing, and those are:
A - B
C - D
How can I calculate those comparisons?
EMMEANS in UNIANOVA does not provide all pairwise comparisons among the cells in an interaction like this. There are some other procedures, such as GENLIN, that do offer these, but use large-sample chi-square statistics rather than t or F statistics. In UNIANOVA, you can get these using the LMATRIX subcommand, or you can use some trickery with EMMEANS.
For the trickery with EMMEANS, create a single factor with four levels that index the 2x2 layout of cells, then handle that as a one-way model. The main effect for that is the same as the overall 3 degree of freedom model for the 2x2 layout, and of course EMMEANS with COMPARE works fine on that.
Without creating a new variable, you can use LMATRIX with:
/LMATRIX "(1,1) - (2,2)" var1 1 -1 var2 1 -1 var1*var2 1 0 0 -1
/LMATRIX "(1,2) - (2,1)" var1 1 -1 var2 -1 1 var1*var2 0 1 -1 0
The quoted pieces are labels, indicating the cells in the 2x2 design being compared.
Another trick you can use to make specifying the LMATRIX simpler, but without creating a new variable, is to specify the DESIGN with just the interaction term and suppress the intercept. That makes the parameter estimates just the four cell means:
UNIANOVA Y BY var1 var2
/INTERCEPT=EXCLUDE
/DESIGN var1*var2
/LMATRIX "(1,1) - (2,2)" var1*var2 1 0 0 -1
/LMATRIX "(1,2) - (2,1)" var1*var2 0 1 -1 0.
In this case the one effect shown in the ANOVA table is a 4 df effect testing all means against 0, so it's not of interest, but the comparisons you want are easily obtained. Note that this trick only works with procedures that don't reparameterize to full rank.
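To see why the coefficient vectors 1 0 0 -1 and 0 1 -1 0 pick out the two diagonal comparisons, here is a small sketch outside SPSS. The data values are invented, and the cell order (1,1), (1,2), (2,1), (2,2) is assumed to match the order of the var1*var2 parameters. It reproduces only the point estimates of the contrasts, not the standard errors or tests that UNIANOVA reports.

```python
from statistics import mean

# Hypothetical observations keyed by (var1 level, var2 level).
cells = {
    (1, 1): [4.0, 6.0],
    (1, 2): [7.0, 9.0],
    (2, 1): [1.0, 3.0],
    (2, 2): [10.0, 12.0],
}
order = [(1, 1), (1, 2), (2, 1), (2, 2)]

# With no intercept and only the interaction term in the design,
# the four parameter estimates are just the four cell means.
cell_means = [mean(cells[key]) for key in order]

def contrast(coeffs, means):
    """Linear combination of cell means, one coefficient per cell."""
    return sum(c * m for c, m in zip(coeffs, means))

print(contrast([1, 0, 0, -1], cell_means))  # mean(1,1) - mean(2,2)
print(contrast([0, 1, -1, 0], cell_means))  # mean(1,2) - mean(2,1)
```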

Prime factorization of integers with Maxima

I want to use Maxima to get the prime factorization of a random positive integer, e.g. 12=2^2*3^1.
What I have tried so far:
a:random(20);
aa:abs(a);
fa:ifactors(aa);
ka:length(fa);
ta:1;
pfza: for i:1 while i<=ka do ta:ta*(fa[i][1])^(fa[i][2]);
ta;
This will be implemented in STACK for Moodle as part of an online exercise for students, so the exact implementation will be a little bit different from this, but I broke it down to these 7 lines.
I generate a random number a, make sure that it is a positive integer by using aa=|a|+1 and want to use the ifactors command to get the prime factors of aa. ka tells me the number of pairwise distinct prime factors, which I then use for the while loop in pfza. If I let this piece of code run, it returns everything fine, except that ta gets simplified: I don't get ta as a product of primes with exponents, but rather just ta=aa.
I then tried to turn off the simplifier, manually simplifying everything else that I need:
simp:false$
a:random(20);
aa:ev(abs(a),simp);
fa:ifactors(aa);
ka:ev(length(fa),simp);
ta:1;
pfza: for i:1 while i<=ka do ta:ta*(fa[i][1])^(fa[i][2]);
ta;
This however does not compile; I assume the problem is somewhere in the line for pfza, but I don't know why.
Any input on how to fix this? Or another method of getting the factorizing in a non-simplified form?
(1) The for-loop fails because adding 1 to i requires 1 + 1 to be simplified to 2, but simplification is disabled. Here's a way to make the loop work without requiring arithmetic.
(%i10) for f in fa do ta:ta*(f[1]^f[2]);
(%o10) done
(%i11) ta;
2 2 1
(%o11) ((1 2 ) 2 ) 3
Hmm, that's strange, again because of the lack of simplification. How about this:
(%i12) apply ("*", map (lambda ([f], f[1]^f[2]), fa));
2 1
(%o12) 2 3
In general I think it's better to avoid explicit indexing anyway.
(2) But maybe you don't need that at all. factor returns an unsimplified expression of the kind you are trying to construct.
(%i13) simp:true;
(%o13) true
(%i14) factor(12);
2
(%o14) 2 3
I think it's conceptually inconsistent for factor to return an unsimplified expression, but anyway it seems to work here.
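For comparison outside Maxima, the same idea (keep the (prime, exponent) pairs and only format them, never multiply them out) can be sketched in Python. The function names are invented for this example, and the pair list mirrors what ifactors returns; trial division is used purely for brevity.

```python
def prime_factors(n):
    """Return [(prime, exponent), ...] for an integer n >= 2, by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            factors.append((p, exponent))
        p += 1
    if n > 1:
        # Whatever remains is a single prime larger than sqrt of the original n.
        factors.append((n, 1))
    return factors

def format_factorization(n):
    """Render the factorization as text so nothing gets multiplied back together."""
    return "*".join(f"{p}^{e}" for p, e in prime_factors(n))

print(format_factorization(12))  # 2^2*3^1
```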

Is there a language construct in F# for testing if a number is between two other numbers (in a range)?

I am looking for a more succinct F# equivalent of:
myNumber >= 2 && myNumber <= 4
I imagine something like
myNumber >=< (2, 4)
Is there some kind of operation like this?
There is no native operator, but you could define your own.
let inline (>=<) a (b,c) = a >= b && a <= c
John's answer is exactly what you asked for, and the most practical solution. But this got me wondering if one could define operator(s) to enable a syntax closer to normal mathematical notation, i.e., a <= b <= c.
Here's one such solution:
let inline (<=.) left middle = (left <= middle, middle)
let inline (.<=) (leftResult, middle) right = leftResult && (middle <= right)
let inline (.<=.) middleLeft middleRight = (middleLeft .<= middleRight, middleRight)
1 <=. 3 .<=. 5 .<= 9 // true
1 <=. 10 .<= 5 // false
A few comments on this:
I used the . character to indicate the "middle" of the expression
. was a very deliberate choice, and is not easily changeable to some other character you like better (e.g. if you perhaps like the look of 1 <=# 3 #<= 5 better). The F# compiler changes the associativity and/or precedence of an operator based on the operator symbol's first character. We want standard left-to-right evaluation/short-circuiting, and . enables this.
A 3-number comparison is optimized away completely, but a 4+ number comparison results in CIL that allocates tuples and does various other business that isn't strictly necessary:
Is there some kind of operation like this?
Great question! The answer is "no", there isn't, but I wish there was.
Latkin's answer is nice, but it doesn't short-circuit evaluate. So if the first test fails the remaining subexpressions still get evaluated, even though their results are irrelevant.
FWIW, in Mathematica you can do 1<x<2 just like mathematics.
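Python is another language where the mathematical notation works directly, and its chained comparisons do short-circuit. This is only a cross-language illustration, not an F# solution; `traced` is a hypothetical helper that records which operands were actually evaluated.

```python
def in_range(x, low, high):
    # Chained comparison: equivalent to (low <= x) and (x <= high),
    # except that x is evaluated only once.
    return low <= x <= high

evaluated = []

def traced(value):
    """Record that this operand was evaluated, then return it unchanged."""
    evaluated.append(value)
    return value

# 10 <= 1 is already false, so the last operand is never evaluated.
result = traced(10) <= traced(1) <= traced(5)
print(result, evaluated)  # False [10, 1]
```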

Multiset Partition Using Linear Arithmetic and Z3

I have to partition a multiset into two sets whose sums are equal. For example, given the multiset:
1 3 5 1 3 -1 2 0
I would output the two sets:
1) 1 3 3
2) 5 -1 2 1 0
both of which sum to 7.
I need to do this using Z3 (smt2 input format) and "Linear Arithmetic Logic", which is defined as:
formula : formula /\ formula | (formula) | atom
atom : sum op sum
op : = | <= | <
sum : term | sum + term
term : identifier | constant | constant identifier
I honestly don't know where to begin with this and any advice at all would be appreciated.
Regards.
Here is an idea:
1- Create a 0-1 integer variable c_i for each element. The idea is that c_i is 0 if element i is in the first set, and 1 if it is in the second set. You can accomplish that by asserting 0 <= c_i and c_i <= 1.
2- The sum of the elements in the first set can be written as 1*(1 - c_1) + 3*(1 - c_2) + ...
3- The sum of the elements in the second set can be written as 1*c_1 + 3*c_2 + ...
4- Assert that the two sums are equal, and ask the solver for a model.
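As a rough sketch of how this encoding could be turned into smt2 input, the Python below generates the script as text. All function names are invented for the example. It relies on one rearrangement: since the first sum equals the total minus the second sum, requiring the two sums to be equal is the same as 2*(a_1*c_1 + ... + a_n*c_n) = total, which keeps every term in the "constant identifier" form. A brute-force search over the 0/1 assignments is included only to sanity-check the encoding without a solver.

```python
from itertools import product

def smt_const(n):
    # SMT-LIB has no negative literals; -3 is written (- 3).
    return str(n) if n >= 0 else f"(- {-n})"

def partition_smt2(values):
    """Emit an SMT-LIB2 script for the 0-1 encoding (assumes len(values) >= 2)."""
    names = [f"c_{i}" for i in range(1, len(values) + 1)]
    lines = ["(set-logic QF_LIA)"]
    for name in names:
        lines.append(f"(declare-const {name} Int)")
        lines.append(f"(assert (<= 0 {name}))")
        lines.append(f"(assert (<= {name} 1))")
    # Both sums equal  <=>  2 * sum(a_i * c_i) = total.
    terms = " ".join(f"(* {smt_const(2 * v)} {n})" for v, n in zip(values, names))
    lines.append(f"(assert (= (+ {terms}) {smt_const(sum(values))}))")
    lines += ["(check-sat)", "(get-model)"]
    return "\n".join(lines)

def find_partition(values):
    """Brute-force the same constraint; returns a 0/1 tuple or None."""
    total = sum(values)
    for bits in product((0, 1), repeat=len(values)):
        if 2 * sum(v * b for v, b in zip(values, bits)) == total:
            return bits
    return None

values = [1, 3, 5, 1, 3, -1, 2, 0]
print(partition_smt2(values))
print(find_partition(values))
```

In the model Z3 returns, the elements with c_i = 0 form the first set and those with c_i = 1 form the second.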
While SMT-Lib2 is quite expressive, it's not the easiest language to program in. Unless you have a hard requirement that you have to code directly in SMTLib2, I'd recommend looking into other languages that have higher-level bindings to SMT solvers. For instance, both Haskell and Scala have libraries that allow you to script SMT solvers at a much higher level. Here's how to solve your problem using Haskell, for instance: https://gist.github.com/1701881.
The idea is that these libraries allow you to code at a much higher level, and then perform the necessary translation and querying of the SMT solver for you behind the scenes. (If you really need to get your hands onto the SMTLib encoding of your problem, you can use these libraries as well, as they typically come with the necessary API to dump the SMTLib they generate before querying the solver.)
While these libraries may not offer everything that Z3 gives you access to via SMTLib, they are much easier to use for most practical problems of interest.

Can a SHA-1 hash be all-zeroes?

Is there any input that SHA-1 will compute to a hex value of forty zeros, i.e. "0000000000000000000000000000000000000000"?
Yes, it's just incredibly unlikely. I.e. one in 2^160, or 0.00000000000000000000000000000000000000000000006842277657836021%.
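The figure is easy to reproduce. This sketch assumes, as the estimate above does, that every 160-bit output is equally likely; the function name is made up for the example.

```python
from fractions import Fraction

def zero_hash_chance_percent(bits=160):
    """Chance, in percent, that a uniformly random `bits`-bit digest is all zeros."""
    return float(Fraction(100, 2 ** bits))

print(zero_hash_chance_percent())  # on the order of 6.8e-47 percent
```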
Also, because SHA1 is cryptographically strong, it would also be computationally infeasible (at least with current computer technology -- all bets are off for emergent technologies such as quantum computing) to find out what data would result in an all-zero hash until it occurred in practice. If you really must use the "0" hash as a sentinel, be sure to include an appropriate assertion (that you did not just hash input data to your "zero" hash sentinel) that survives into production. It is a failure condition your code will permanently need to check for. WARNING: Your code will permanently be broken if it does.
Depending on your situation (if your logic can cope with handling the empty string as a special case in order to forbid it from input) you could use the SHA1 hash ('da39a3ee5e6b4b0d3255bfef95601890afd80709') of the empty string. Also possible is using the hash for any string not in your input domain such as sha1('a') if your input has numeric-only as an invariant. If the input is preprocessed to add any regular decoration then a hash of something without the decoration would work as well (eg: sha1('abc') if your inputs like 'foo' are decorated with quotes to something like '"foo"').
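A minimal sketch of both suggestions using Python's hashlib: the empty-string digest quoted above, and a check against the all-zero sentinel that survives into production. `checked_sha1` and the constant names are hypothetical. An explicit raise is used rather than `assert`, since Python drops assert statements when run with -O.

```python
import hashlib

ZERO_SHA1 = "0" * 40
EMPTY_SHA1 = hashlib.sha1(b"").hexdigest()

def checked_sha1(data: bytes) -> str:
    """Hash data, refusing to return a digest that collides with the sentinel."""
    digest = hashlib.sha1(data).hexdigest()
    if digest == ZERO_SHA1:
        # Astronomically unlikely, but the sentinel scheme is broken if it happens.
        raise RuntimeError("input hashed to the all-zero sentinel value")
    return digest

print(EMPTY_SHA1)  # da39a3ee5e6b4b0d3255bfef95601890afd80709
print(checked_sha1(b"foo"))
```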
I don't think so.
There is no easy way to show why it's not possible. If there was, then this would itself be the basis of an algorithm to find collisions.
Longer analysis:
The preprocessing makes sure that there is always at least one 1 bit in the input.
The loop over w[i] will leave the original stream alone, so there is at least one 1 bit in the input (words 0 to 15). Even with clever design of the bit patterns, at least some of the values from 0 to 15 must be non-zero since the loop doesn't affect them.
Note: leftrotate is circular, so no 1 bits will get lost.
In the main loop, it's easy to see that the factor k is never zero, so temp can't be zero for the reason that all operands on the right hand side are zero (k never is).
This leaves us with the question whether you can create a bit pattern for which (a leftrotate 5) + f + e + k + w[i] returns 0 by overflowing the sum. For this, we need to find values for w[i] such that w[i] = 0 - ((a leftrotate 5) + f + e + k)
This is possible for the first 16 values of w[i] since you have full control over them. But the words 16 to 79 are again created by xoring the first 16 values.
So the next step could be to unroll the loops and create a system of linear equations. I'll leave that as an exercise to the reader ;-) The system is interesting since we have a loop that creates additional equations until we end up with a stable result.
Basically, the algorithm was chosen in such a way that you can create individual 0 words by selecting input patterns but these effects are countered by xoring the input patterns to create the 64 other inputs.
Just an example: To make temp 0, we have
a = h0 = 0x67452301
f = (b and c) or ((not b) and d)
= (h1 and h2) or ((not h1) and h3)
= (0xEFCDAB89 & 0x98BADCFE) | (~0xEFCDAB89 & 0x10325476)
= 0x98badcfe
e = 0xC3D2E1F0
k = 0x5A827999
which gives us (a leftrotate 5) + f + e + k = 0x9fb498b3, so w[0] = 0 - 0x9fb498b3 = 0x604b674d (mod 2^32), etc. This value is then used in the words 16, 19, 22, 24-25, 27-28, 30-79.
Word 1, similarly, is used in words 1, 17, 20, 23, 25-26, 28-29, 31-79.
As you can see, there is a lot of overlap. If you calculate the input value that would give you a 0 result, that value influences at least 32 other input values.
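The arithmetic in the example can be checked with a few lines of Python. This is only the first step of the first round, using the standard SHA-1 initial values and round constant listed above, with all sums taken mod 2^32. The sum (a leftrotate 5) + f + e + k comes to 0x9fb498b3, and the w[0] that makes temp zero is its negation mod 2^32.

```python
MASK = 0xFFFFFFFF

def rotl(x, n):
    """32-bit circular left rotation."""
    return ((x << n) | (x >> (32 - n))) & MASK

# SHA-1 initial state and the round constant for rounds 0-19.
h0, h1, h2, h3, h4 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0
k = 0x5A827999

a, b, c, d, e = h0, h1, h2, h3, h4
f = (b & c) | (~b & d)                      # d is non-negative, so the result is too
partial = (rotl(a, 5) + f + e + k) & MASK   # temp without the w[0] term
w0 = (-partial) & MASK                      # the w[0] that makes temp == 0

print(hex(f), hex(partial), hex(w0))
```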
The post by Aaron is incorrect. It is getting hung up on the internals of the SHA1 computation while ignoring what happens at the end of the round function.
Specifically, see the pseudo-code from Wikipedia. At the end of the round, the following computation is done:
h0 = h0 + a
h1 = h1 + b
h2 = h2 + c
h3 = h3 + d
h4 = h4 + e
So an all 0 output can happen if h0 == -a, h1 == -b, h2 == -c, h3 == -d, and h4 == -e going into this last section, where the computations are mod 2^32.
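Spelled out for a single-block message, where h0..h4 are still the standard initial values when the final addition happens, the working variables would have to take the values computed below. The helper names are invented for this sketch; it shows only the final addition, not a way to find an input that reaches that state.

```python
MASK = 0xFFFFFFFF

# Standard SHA-1 initial chaining values h0..h4.
H_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)

def required_working_state(h):
    """Values of (a, b, c, d, e) that cancel h in the final addition mod 2^32."""
    return tuple((-x) & MASK for x in h)

def final_addition(h, working):
    """The end-of-block update: h_i = h_i + working_i, mod 2^32."""
    return tuple((x + w) & MASK for x, w in zip(h, working))

target = required_working_state(H_INIT)
print([hex(x) for x in target])
print(final_addition(H_INIT, target))  # (0, 0, 0, 0, 0)
```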
To answer your question: nobody knows whether there exists an input that produces an all-zero output, but cryptographers expect that there is, based upon the simple argument provided by daf.
Without any knowledge of SHA-1 internals, I don't see why any particular value should be impossible (unless explicitly stated in the description of the algorithm). An all-zero value is no more or less probable than any other specific value.
Contrary to all of the current answers here, nobody knows that. There's a big difference between a probability estimation and a proof.
But you can safely assume it won't happen. In fact, you can safely assume that just about ANY value won't be the result (assuming it wasn't obtained through some SHA-1-like procedures). You can assume this as long as SHA-1 is secure (it actually isn't anymore, at least theoretically).
People don't seem to realize just how improbable it is (if all humanity focused all of its current resources on finding a zero hash by brute force, it would take about xxx... ages of the current universe to crack it).
If you know the function is safe, it's not wrong to assume it won't happen. That may change in the future, so assume some malicious inputs could give that value (e.g. don't erase user's HDD if you find a zero hash).
If anyone still thinks it's not "clean" or something, I can tell you that nothing is guaranteed in the real world, because of quantum mechanics. You assume you can't walk through a solid wall just because of an insanely low probability.
[I'm done with this site... My first answer here, I tried to write a nice answer, but all I see is a bunch of downvoting morons who are wrong and can't even tell the reason why are they doing it. Your community really disappointed me. I'll still use this site, but only passively]
Contrary to all answers here, the answer is simply No.
The hash value always contains bits set to 1.
