When to prefer `and` over `andalso` in guard tests - erlang

I am curious why the comma ‹,› is a shortcut for and and not andalso in guard tests.
Since I'd call myself a “C native” I fail to see any shortcomings of short-circuit boolean evaluation.
I compiled some test code using the to_core flag to see what code is actually generated. Using the comma, I see the left hand value and right and value get evaluated and both and'ed. With andalso you have a case block within the case block and no call to erlang:and/2.
I did no benchmark tests but I daresay the andalso variant is the faster one.

To delve into the past:
Originally in guards there were only , separated tests which were evaluated from left-to-right until either there were no more and the guard succeeded or a test failed and the guard as a whole failed. Later ; was added to allow alternate guards in the same clause. If guards evaluate both sides of a , before testing then someone has gotten it wrong along the way. #Kay's example seems to imply that they do go from left-to-right as they should.
Boolean operators were only allowed much later in guards.
and, together with or, xor and not, is a boolean operator and was not intended for control. They are all strict and evaluate their arguments first, like the arithmetic operators +, -, * and '/'. There exist strict boolean operators in C as well.
The short-circuiting control operators andalso and orelse were added later to simplify some code. As you have said the compiler does expand them to nested case expressions so there is no performance gain in using them, just convenience and clarity of code. This would explain the resultant code you saw.
N.B. in guards there are tests and not expressions. There is a subtle difference which means that while using and and andalso is equivalent to , using orelse is not equivalent to ;. This is left to another question. Hint: it's all about failure.
So both and and andalso have their place.

Adam Lindbergs link is right. Using the comma does generate better beam code than using andalso. I compiled the following code using the +to_asm flag:
a(A,B) ->
case ok of
_ when A, B -> true;
_ -> false
end.
aa(A,B) ->
case ok of
_ when A andalso B -> true;
_ -> false
end.
which generates
{function, a, 2, 2}.
{label,1}.
{func_info,{atom,andAndAndalso},{atom,a},2}.
{label,2}.
{test,is_eq_exact,{f,3},[{x,0},{atom,true}]}.
{test,is_eq_exact,{f,3},[{x,1},{atom,true}]}.
{move,{atom,true},{x,0}}.
return.
{label,3}.
{move,{atom,false},{x,0}}.
return.
{function, aa, 2, 5}.
{label,4}.
{func_info,{atom,andAndAndalso},{atom,aa},2}.
{label,5}.
{test,is_atom,{f,7},[{x,0}]}.
{select_val,{x,0},{f,7},{list,[{atom,true},{f,6},{atom,false},{f,9}]}}.
{label,6}.
{move,{x,1},{x,2}}.
{jump,{f,8}}.
{label,7}.
{move,{x,0},{x,2}}.
{label,8}.
{test,is_eq_exact,{f,9},[{x,2},{atom,true}]}.
{move,{atom,true},{x,0}}.
return.
{label,9}.
{move,{atom,false},{x,0}}.
return.
I only looked into what is generated with the +to_core flag, but obviously there is a optimization step between to_core and to_asm.

It's an historical reason. and was implemented before andalso, which was introduced in Erlang 5.1 (the only reference I can find right now is EEP-17). Guards have not been changed because of backwards compatibility.

The boolean operators "and" and "or" always evaluate arguements on both the sides of the operator. Whereas if you want the functionality of C operators && and || (where 2nd arguement is evaluated only if needed..for eg if we want to evalue "true orelse false" as soon as true is found to be the first arguement, the second arguement will not be evaluated which is not the case had "or" been used ) go for "andalso" and "orelse".

Related

Dart operator tear-off?

bool allTrue(Iterable<bool> bools) => bools.reduce(&&);
Is the reason that one cannot do the above that operator tear-offs are not allowed in dart, cf. https://github.com/dart-lang/sdk/issues/27518.
And to actually make it work, is there a nicer way than to use the lambda as in
bool allTrue(Iterable<bool> bools) => bools.reduce((a, b) => a && b);
?
There are lots of reasons it's not allowed.
Dart doesn't allow operator tear-offs. You can tear off [1].add, but not 1.+.
Those are both instance methods. but there is no allowed syntax for naming the operator methods, only for invoking them.
It's definitely possible that Dart could add syntax to allow 1.+ to return, effectively, (num other) => 1 + other. Just hasn't happened yet.
What you write is different. You write && by itself, not as a member of any instance, and expect (a, b) => a && b. That's not an operator tear-off, that's just some kind of abstraction over operands.
Again, there is no strong reason Dart couldn't allow something like that, say any unary or binary operator as a full expression, e.g., in parentheses like (+) or (~), would effectively mean (a, b) => a + b or (a) => ~a , with types inferred from the context when possible.
Hasn't happened yet, and not really on anybody's list of priorities.
And then there is the problem that &&, || and ?? are short-circuiting operators. Even if you do (&&) to get (a, b) => a && b, calling that function is not the same as using && on two expressions, because the second argument is always evaluated in the function call.
Those operators are just not good candidates for allowing "operator-functionalization". You would use (&) and (|) instead, since they work exactly the same.
It's unlikely that even if Dart allowed (+), it would also allow (&&).
uh...
Yes (and no)
Sorry. That is a bit short for an answer, but what do you expect. It does not work this way. In any compiled language I know, and that includes Dart. Not because of that specific issue though. In general.
That said, operators are not functions. For example, they might short-circuit, something functions cannot do. If you could "tear off" an operator, how would that work? Again, that is the case in almost all compiled languages, nothing special to see here for Dart.

What's the purpose of this gen_server.erl code?

unregister_name({local,Name}) ->
_ = (catch unregister(Name));
unregister_name({global,Name}) ->
_ = global:unregister_name(Name);
unregister_name({via, Mod, Name}) ->
_ = Mod:unregister_name(Name);
unregister_name(Pid) when is_pid(Pid) ->
Pid.
This is from gen_server.erl. If _ always matches and the match always evaluates to the right hand side expression, what are the _ = expression() lines doing here?
Typically _ = ... matches are used to quiet dialyzer warnings about unmatched function return values when its -Wunmatched_returns option is used. As the documentation explains:
-Wunmatched_returns
Include warnings for function calls which ignore a structured return value or
do not match against one of many possible return value(s).
By explicitly matching the return value against the _ "don't care" variable, you can use this useful dialyzer option without having to see warnings for return values you don't care about.
In Erlang, last expression of function is its return value, so someone might be tempted to check, what global:unregister_name/1 or Mod:unregister_name(Name) return and try to pattern match on that.
The _ = expression() doesn't do anything in particular, but hints, that this return value should be ignored (for example, because they are not documented and might be subject to change). However in the last expression, Pid is returned explicitly. This means, that you can pattern match like this:
case unregister_name(Something) of
Pid when is_pid(Pid) -> foo();
_ -> bar()
end.
To sum up: those lines aren't doing anything there, but when someone else is reading the source code, they show original programmer intent.
Unfortunately, this particular function is not exported and in the original module never used in pattern match, so I don't have an example to back this up :)
And I'll note that I've since come across this:
The Power of Ten – Rules for Developing Safety Critical Code
Gerard J. Holzmann
NASA/JPL Laboratory for Reliable Software Pasadena, CA
91109
[...]
Rule: The return value of non-void functions must be checked by each calling function, and the validity of parameters must be checked
inside each function.
Rationale: This is possibly the most frequently
violated rule, and therefore somewhat more suspect as a general rule.
In its strictest form, this rule means that even the return value of
printf statements and file close statements must be checked. One can
make a case, though, that if the response to an error would rightfully
be no different than the response to success, there is little point in
explicitly checking a return value. This is often the case with calls
to printf and close. In cases like these, it can be acceptable to
explicitly cast the function return value to (void) – thereby
indicating that the programmer explicitly and not accidentally decides
to ignore a return value. In more dubious cases, a comment should be
present to explain why a return value is irrelevant. In most cases,
though, the return value of a function should not be ignored,
especially if error return values must be propagated up the function
call chain. Standard libraries famously violate this rule with
potentially grave consequences. See, for instance, what happens if you
accidentally execute strlen(0), or strcat(s1, s2, -1) with the
standard C string library – it is not pretty. By keeping the general
rule, we make sure that exceptions must be justified, with mechanical
checkers flagging violations. Often, it will be easier to comply with
the rule than to explain why noncompliance might be acceptable.

Why is "do" allowed inside a function?

I noticed that the following code compiles and works in VS 2013:
let f() =
do Console.WriteLine(41)
42
But when looking at the F# 3.0 specification I can't find any mention of do being used this way. As far as I can tell, do can have the following uses:
As a part of loop (e.g. while expr do expr done), that's not the case here.
Inside computation expressions, e.g.:
seq {
for i in 1..2 do
do Console.WriteLine(i)
yield i * 2
}
That's not the case here either, f doesn't contain any computation expressions.
Though what confuses me here is that according to the specification, do should be followed by in. That in should be optional due to lightweight syntax, but adding it here causes a compile error (“Unexpected token 'in' or incomplete expression”).
Statement inside a module or class. This is also not the case here, the do is inside a function, not inside a module or a class.
I also noticed that with #light "off", the code doesn't compile (“Unexpected keyword 'do' in binding”), but I didn't find anything that would explain this in the section on lightweight syntax either.
Based on all this, I would assume that using do inside a function this way should not compile, but it does. Did I miss something in the specification? Or is this actually a bug in the compiler or in the specification?
From the documentation on MSDN:
A do binding is used to execute code without defining a function or value.
Even though the spec doesn't contain a comprehensive list of the places it is allowed, it is merely an expression asserted to be of type unit. Some examples:
if ((do ()); true) then ()
let x: unit = do ()
It is generally omitted. Each of the preceding examples are valid without do. Therefore, do serves only to assert that an expression is of type unit.
Going through the F# 3.0 specification expression syntax has do expr as a choice of class-function-or-value-defn (types) [Ch 8, A.2.5] and module-function-or-value-defn (modules) [Ch 10, A.2.1.1].
I don't actually see in the spec where function-defn can have more than one expression, as long all but the last one evaluate to unit -- or that all but the last expression is ignored in determining the functions return value.
So, it seems this is an oversight in the documentation.

Haskell/Parsec: How do you use the functions in Text.Parsec.Indent?

I'm having trouble working out how to use any of the functions in the Text.Parsec.Indent module provided by the indents package for Haskell, which is a sort of add-on for Parsec.
What do all these functions do? How are they to be used?
I can understand the brief Haddock description of withBlock, and I've found examples of how to use withBlock, runIndent and the IndentParser type here, here and here. I can also understand the documentation for the four parsers indentBrackets and friends. But many things are still confusing me.
In particular:
What is the difference between withBlock f a p and
do aa <- a
pp <- block p
return f aa pp
Likewise, what's the difference between withBlock' a p and do {a; block p}
In the family of functions indented and friends, what is ‘the level of the reference’? That is, what is ‘the reference’?
Again, with the functions indented and friends, how are they to be used? With the exception of withPos, it looks like they take no arguments and are all of type IParser () (IParser defined like this or this) so I'm guessing that all they can do is to produce an error or not and that they should appear in a do block, but I can't figure out the details.
I did at least find some examples on the usage of withPos in the source code, so I can probably figure that out if I stare at it for long enough.
<+/> comes with the helpful description “<+/> is to indentation sensitive parsers what ap is to monads” which is great if you want to spend several sessions trying to wrap your head around ap and then work out how that's analogous to a parser. The other three combinators are then defined with reference to <+/>, making the whole group unapproachable to a newcomer.
Do I need to use these? Can I just ignore them and use do instead?
The ordinary lexeme combinator and whiteSpace parser from Parsec will happily consume newlines in the middle of a multi-token construct without complaining. But in an indentation-style language, sometimes you want to stop parsing a lexical construct or throw an error if a line is broken and the next line is indented less than it should be. How do I go about doing this in Parsec?
In the language I am trying to parse, ideally the rules for when a lexical structure is allowed to continue on to the next line should depend on what tokens appear at the end of the first line or the beginning of the subsequent line. Is there an easy way to achieve this in Parsec? (If it is difficult then it is not something which I need to concern myself with at this time.)
So, the first hint is to take a look at IndentParser
type IndentParser s u a = ParsecT s u (State SourcePos) a
I.e. it's a ParsecT keeping an extra close watch on SourcePos, an abstract container which can be used to access, among other things, the current column number. So, it's probably storing the current "level of indentation" in SourcePos. That'd be my initial guess as to what "level of reference" means.
In short, indents gives you a new kind of Parsec which is context sensitive—in particular, sensitive to the current indentation. I'll answer your questions out of order.
(2) The "level of reference" is the "belief" referred in the current parser context state of where this indentation level starts. To be more clear, let me give some test cases on (3).
(3) In order to start experimenting with these functions, we'll build a little test runner. It'll run the parser with a string that we give it and then unwrap the inner State part using an initialPos which we get to modify. In code
import Text.Parsec
import Text.Parsec.Pos
import Text.Parsec.Indent
import Control.Monad.State
testParse :: (SourcePos -> SourcePos)
-> IndentParser String () a
-> String -> Either ParseError a
testParse f p src = fst $ flip runState (f $ initialPos "") $ runParserT p () "" src
(Note that this is almost runIndent, except I gave a backdoor to modify the initialPos.)
Now we can take a look at indented. By examining the source, I can tell it does two things. First, it'll fail if the current SourcePos column number is less-than-or-equal-to the "level of reference" stored in the SourcePos stored in the State. Second, it somewhat mysteriously updates the State SourcePos's line counter (not column counter) to be current.
Only the first behavior is important, to my understanding. We can see the difference here.
>>> testParse id indented ""
Left (line 1, column 1): not indented
>>> testParse id (spaces >> indented) " "
Right ()
>>> testParse id (many (char 'x') >> indented) "xxxx"
Right ()
So, in order to have indented succeed, we need to have consumed enough whitespace (or anything else!) to push our column position out past the "reference" column position. Otherwise, it'll fail saying "not indented". Similar behavior exists for the next three functions: same fails unless the current position and reference position are on the same line, sameOrIndented fails if the current column is strictly less than the reference column, unless they are on the same line, and checkIndent fails unless the current and reference columns match.
withPos is slightly different. It's not just a IndentParser, it's an IndentParser-combinator—it transforms the input IndentParser into one that thinks the "reference column" (the SourcePos in the State) is exactly where it was when we called withPos.
This gives us another hint, btw. It lets us know we have the power to change the reference column.
(1) So now let's take a look at how block and withBlock work using our new, lower level reference column operators. withBlock is implemented in terms of block, so we'll start with block.
-- simplified from the actual source
block p = withPos $ many1 (checkIndent >> p)
So, block resets the "reference column" to be whatever the current column is and then consumes at least 1 parses from p so long as each one is indented identically as this newly set "reference column". Now we can take a look at withBlock
withBlock f a p = withPos $ do
r1 <- a
r2 <- option [] (indented >> block p)
return (f r1 r2)
So, it resets the "reference column" to the current column, parses a single a parse, tries to parse an indented block of ps, then combines the results using f. Your implementation is almost correct, except that you need to use withPos to choose the correct "reference column".
Then, once you have withBlock, withBlock' = withBlock (\_ bs -> bs).
(5) So, indented and friends are exactly the tools to doing this: they'll cause a parse to immediately fail if it's indented incorrectly with respect to the "reference position" chosen by withPos.
(4) Yes, don't worry about these guys until you learn how to use Applicative style parsing in base Parsec. It's often a much cleaner, faster, simpler way of specifying parses. Sometimes they're even more powerful, but if you understand Monads then they're almost always completely equivalent.
(6) And this is the crux. The tools mentioned so far can only do indentation failure if you can describe your intended indentation using withPos. Quickly, I don't think it's possible to specify withPos based on the success or failure of other parses... so you'll have to go another level deeper. Fortunately, the mechanism that makes IndentParsers work is obvious—it's just an inner State monad containing SourcePos. You can use lift :: MonadTrans t => m a -> t m a to manipulate this inner state and set the "reference column" however you like.
Cheers!

Why no dynamic bit pattern in function argument?

I am experimenting with bit pattern matching in Erlang:
-module(test).
-export([test/2]).
%test(P,<<X:P,0:1>>) ->
% X.
test(P,X) ->
<<Y:P,0:1>> = X,
Y.
when compiling the commented out version of test/2 i get a complaint that "variable 'P' is unbound".
Is there any good reason for not allowing the first version to work the same as the second?
Because in the commented out version P is a length - for it to work Erlang would need to perform a double match - match the value of the 2nd parameter with a pattern which is undetermined...
The question you are asking in a clause pattern match is "is this the clause for me" - you can't 'pop into the clause' and then back out if it isn't...
In the second example X is bound before the match, you are committed to going into the clause and if <<Y:P,0:1>> don't match X, well crash time!
The reason is that arguments to the function are evaluated independent of each other. The correctness of bindings to variables in only checked as a second step.
This means that in your first example P will be unbound when evaluating the second argument, which is against the rules of pattern matching. In contrast, in your second example, P is bound at the time of evaluating the pattern match on the binary.

Resources