Dart operator tear-off?

bool allTrue(Iterable<bool> bools) => bools.reduce(&&);
Is the reason one cannot do the above that operator tear-offs are not allowed in Dart (cf. https://github.com/dart-lang/sdk/issues/27518)?
And to actually make it work, is there a nicer way than to use the lambda as in
bool allTrue(Iterable<bool> bools) => bools.reduce((a, b) => a && b);
?

There are lots of reasons it's not allowed.
Dart doesn't allow operator tear-offs. You can tear off [1].add, but not 1.+.
Those are both instance methods, but there is no allowed syntax for naming the operator methods, only for invoking them.
It's definitely possible that Dart could add syntax to allow 1.+ to return, effectively, (num other) => 1 + other. Just hasn't happened yet.
What you write is different. You write && by itself, not as a member of any instance, and expect (a, b) => a && b. That's not an operator tear-off, that's just some kind of abstraction over operands.
Again, there is no strong reason Dart couldn't allow something like that. Say, any unary or binary operator written as a full expression, e.g., in parentheses like (+) or (~), could effectively mean (a, b) => a + b or (a) => ~a, with types inferred from the context when possible.
Hasn't happened yet, and not really on anybody's list of priorities.
And then there is the problem that &&, || and ?? are short-circuiting operators. Even if you do (&&) to get (a, b) => a && b, calling that function is not the same as using && on two expressions, because the second argument is always evaluated in the function call.
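Those operators are just not good candidates for allowing "operator-functionalization". You would use (&) and (|) instead: on bool operands they give the same results as && and ||, but they always evaluate both sides, which is exactly what a function call does.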
It's unlikely that even if Dart allowed (+), it would also allow (&&).
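The short-circuiting point can be made concrete with a small sketch. It is written in Python rather than Dart, purely as an illustration, and the helper names (noisy, and_fn, all_true) are made up for this example. It shows that wrapping a short-circuiting operator in a function forces both operands to be evaluated, and that folding with the strict `&` operator works because that operator is already available as a plain function.

```python
from functools import reduce
import operator

calls = []


def noisy(name, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(name)
    return value


def and_fn(a, b):
    """Function form of `and`: both arguments are evaluated before the call."""
    return a and b


# Built-in operator: the right operand is skipped once the left one is False.
calls.clear()
short = noisy("left", False) and noisy("right", True)
short_calls = list(calls)

# Function form: both operands are evaluated before and_fn even starts.
calls.clear()
strict = and_fn(noisy("left", False), noisy("right", True))
strict_calls = list(calls)


def all_true(bools):
    """Fold with the strict `&` operator, which exists as a plain function."""
    return reduce(operator.and_, bools)
```

Both forms produce the same value here; only the set of evaluated operands differs, which matters as soon as the right operand has side effects or can fail.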

uh...
Yes (and no)
Sorry. That is a bit short for an answer, but what do you expect. It does not work this way. In any compiled language I know, and that includes Dart. Not because of that specific issue though. In general.
That said, operators are not functions. For example, they might short-circuit, something functions cannot do. If you could "tear off" an operator, how would that work? Again, that is the case in almost all compiled languages, nothing special to see here for Dart.

Related

Implementing Macro in a Rascal language project

Any idea on how to implement macro syntax with Rascal and also how to implement the typing and expansion(translation) of the macro syntax in Rascal? Any link to projects or repositories on this problem would also be appreciated.
Macros are definitions of code substitutions in syntax trees, which is definitely one of the main features of Rascal. Questions I would have before advising specific techniques:
adding macros to an existing language, or to a new language?
macros at refactoring time, at compile time or at run time?
which would inform the question of whether to implement macros on concrete syntax trees or abstract syntax trees.
I would not say macros are a "problem" per se. The raw substitutions in syntax trees are trivial with Rascal. However, "hygienic macros" are more involved. Here we have to consider the capturing of variables by the expanded macro bodies, and what we can do about this (renaming) to avoid it. The literature on how to make macros hygienic is plenty. The complexity of hygienic macros depends on the type and name analysis (scoping) system of the base language that macros are added to.
If you have a DSL that you want to translate in stages to the target code, that can also be called "macros", but you will not find that name in the documentation. Here is an example: https://github.com/usethesource/flybytes/blob/main/src/lang/flybytes/macros/ControlFlow.rsc where "macro" is used to rewrite an additional AST node to its semantics in the "core" language.
The basic mechanisms are:
pattern matching: detects what you want to expand, with macros this is often a single ADT constructor but it can also be a more complex special case like matching i+=1 to substitute it with i++.
substitution: at the location where the match was found, we create a new AST value in a simpler language but with the same semantics. This is done with AST expressions in Rascal, the => operator in visit and insert statements, and return and = in functions.
traversal: guiding the pattern matching and substitution without having to write too many boilerplate recursive functions.
Small example:
data Bool(loc src=|unknown:///|)
= \and(Bool l, Bool r)
| \or(Bool l, Bool r)
| \true()
| \false()
| \not(Bool a)
;
I extend the language with a "macro":
data Bool = impl(Bool l, Bool r);
A first option is to rewrite the constructor immediately and always with an overloaded function:
Bool impl(Bool l, Bool r) = or(not(l), r);
However, we lose some information here for debugging purposes, so let's try to keep the information intact:
Bool impl(Bool l, Bool r, src=loc s) = or(not(l), r, src=s);
Sometimes we want to delay the expansion for a specific stage in the compiler. In particular, with the above "rewrite rule" a type-checker will not see the difference anymore between ==> and ||, which sometimes creates usability issues with error messages.
In that case we wrap the expansion in a visit and stage it as a function:
Bool macroExpansion(Bool input) = visit(input) {
case impl(Bool l, Bool r, src=loc s) => or(not(l), r, src=s)
// add more rules here
}
It is also possible to encapsulate rewrite rules as reusable functions:
Bool expand1(impl(Bool l, Bool r, src=loc s)) = or(not(l), r, src=s);
Bool expand2(not(not(Bool b))) = b;
and then pass those or apply those: (expand1 + expand2)(myBool)
So to wrap this up:
pattern matching is the key to macro expansion, patterns can be wrapped in functions or visit cases or both, and functions can be passed around and combined.
watch out to do some "origin tracking" and forward src fields to the right-hand sides of rewrite rules, otherwise the generated code does not know where it comes from.
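For readers who do not know Rascal, the same three mechanisms (pattern matching, substitution, traversal) plus origin tracking can be sketched in Python. This is only an approximation under stated assumptions: the Node class and the expand function are hypothetical names for this example, and a hand-written bottom-up recursion stands in for Rascal's visit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    op: str                     # "and", "or", "not", "impl", "true", "false"
    args: tuple = ()
    src: Optional[str] = None   # where the node came from in the source text


def expand(node):
    """Bottom-up traversal that rewrites impl(l, r) to or(not(l), r)."""
    # Traversal: expand the children first.
    args = tuple(expand(arg) for arg in node.args)
    # Pattern matching: detect the macro constructor.
    if node.op == "impl":
        left, right = args
        # Substitution: build the core-language term and forward the origin,
        # so the generated `or` still points at the original macro call.
        return Node("or", (Node("not", (left,)), right), node.src)
    return Node(node.op, args, node.src)
```

Calling expand on a tree that contains nested impl nodes rewrites all of them, and each generated `or` keeps the src of the impl it replaced.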

What is the point of op_Quotation if it cannot be used?

According to the F# specification for operator overloading
<# #> op_Quotation
<## ##> op_QuotationUntyped
is given as with many other operators. Unless I'm missing something I don't believe that I can use this for custom types, so why is it listed?
I think you are right that there is no way of actually using those as custom operators. I suspect those are treated as operators in case this was useful, at some point in the future of the language, for some clever new feature.
The documentation really merely explains how the names of the operators get encoded. For non-special operator names, F# encodes those in a systematic way. For the ones listed in the page, it has a special nicer name. Consider this type:
type X() =
static member (<^><>) (a:int,b:int) = a + b
static member (<# #>) (a:int,b:int) = a + b
If you look at the names of those members:
[ for m in typeof<X>.GetMembers() -> m.Name ]
You see that the first operator got compiled as op_LessHatGreaterLessGreater, while the second one as op_Quotation. So this is where the name mentioned in the table comes in - it is probably good this is documented somewhere, but I think you're right, that this is not particularly useful!

Starting a parser for scheme language

I am writing a basic parser for a Scheme interpreter and here are the definitions I have set up to define the various types of tokens:
# 1. Parens
Type:
PAREN
Subtype:
LEFT_PAREN
Value:
'('
# 2. Operators (<=, =, +, ...)
Type:
OPERATOR
Subtype:
EQUALS
Value:
'='
Arity:
2
# 3. Types (2.5, "Hello", #f, etc.)
Type:
DATA
Subtype:
NUMBER
Value:
2.4
# 4. Procedures, builtins, and such
Type:
KEYWORD
Subtype:
BUILTIN
Value:
"set"
Arity:
2
PROCEDURE:
... // probably need a new class for this
Does the above seem like it's a good starting place? Are there some obvious things I'm missing here, or does this give me a "good-enough" foundation?
Your approach makes distinctions which really don't exist in the syntax of the language, and also makes decisions far too early. For example consider this program:
(let ((x 1))
(with-assignment-notes
(set! x 2)
(set! x 3)
x))
When I run this:
> (let ((x 1))
(with-assignment-notes
(set! x 2)
(set! x 3)
x))
setting x to 2
setting x to 3
3
In order for this to work with-assignment-notes has to somehow redefine what (set! ...) means in its body. Here's a hacky and probably incorrect (Racket) implementation of that:
(define-syntax with-assignment-notes
(syntax-rules (set!)
[(_ form ...)
(let-syntax ([rewrite/maybe
(syntax-rules (set!)
[(_ (set! var val))
(let ([r val])
(printf "setting ~A to ~A~%" 'var r)
(set! var r))]
[(_ thing)
thing])])
(rewrite/maybe form) ...)]))
So the critical features of any parser for a Lisp-family language are:
it should not make any decision about the semantics of the language that it can avoid making;
the structure it constructs must be available to the language itself as first-class objects;
(and optionally) the parser should be modifiable from the language itself.
As examples:
it is probably inevitable that the parser needs to make decisions about what is and is not a number and what sort of number it is;
it would be nice if it had default handling for strings, but this should ideally be controllable by the user;
it should make no decision at all about what, say (< x y) means but rather should return a structure representing it for interpretation by the language.
The reason for the last, optional, requirement is that Lisp-family languages are used by people who are interested in using them for implementing languages. Allowing the reader to be altered from within the language makes that hugely easier, since you don't have to start from scratch each time you want to make a language which is a bit like the one you started with but not completely.
Parsing Lisp
The usual approach to parsing Lisp-family languages is to have machinery which will turn a sequence of characters into a sequence of s-expressions consisting of objects which are defined by the language itself, notably symbols and conses (but also numbers, strings &c). Once you have this structure you then walk over it to interpret it as a program: either evaluating it on the fly or compiling it. Critically, you can also write programs which manipulate this structure itself: macros.
In 'traditional' Lisps such as CL this process is explicit: there is a 'reader' which turns a sequence of characters into a sequence of s-expressions, and macros explicitly manipulate the list structure of these s-expressions, after which the evaluator/compiler processes them. So in a traditional Lisp (< x y) would be parsed as (a cons of a symbol < and (a cons of a symbol x and (a cons of a symbol y and the empty list object)), or (< . (x . (y . ()))), and this structure gets handed to the macro expander and hence to the evaluator or compiler.
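As a rough illustration of a reader that decides as little as possible, here is a minimal sketch in Python (not Scheme or CL). The names Symbol, tokenize, parse_atom and read are invented for this example, Python lists stand in for chains of conses, and strings, quotes and dotted pairs are deliberately left out. The only decision it makes is what counts as a number; (< x y) comes back as plain structure with no meaning attached.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str


INTEGER = re.compile(r"[+-]?\d+")
DECIMAL = re.compile(r"[+-]?(\d+\.\d*|\.\d+)")


def tokenize(text):
    """Split into parentheses and runs of non-space, non-paren characters."""
    return re.findall(r"[()]|[^\s()]+", text)


def parse_atom(token):
    """Decide only what is a number; everything else is a symbol."""
    if INTEGER.fullmatch(token):
        return int(token)
    if DECIMAL.fullmatch(token):
        return float(token)
    return Symbol(token)


def read_from(tokens, pos=0):
    """Read one datum starting at tokens[pos]; return (datum, next position)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == "(":
        items = []
        pos += 1
        while pos < len(tokens) and tokens[pos] != ")":
            item, pos = read_from(tokens, pos)
            items.append(item)
        if pos >= len(tokens):
            raise SyntaxError("missing )")
        return items, pos + 1
    if token == ")":
        raise SyntaxError("unexpected )")
    return parse_atom(token), pos + 1


def read(text):
    """Read the first datum in text."""
    datum, _ = read_from(tokenize(text))
    return datum
```

Note that set!, < and let all come back as ordinary symbols: whether they are operators, keywords or user-defined macros is left to whatever walks the structure afterwards.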
In Scheme it is a little more subtle: macros are specified (portably, anyway) in terms of rules which turn a bit of syntax into another bit of syntax, and it's not (I think) explicit whether such objects are made of conses & symbols or not. But the structure which is available to syntax rules needs to be as rich as something made of conses and symbols, because syntax rules get to poke around inside it. If you want to write something like the following macro:
(define-syntax with-silly-escape
(syntax-rules ()
[(_ (escape) form ...)
(call/cc (λ (c)
(define (escape) (c 'escaped))
form ...))]
[(_ (escape val ...) form ...)
(call/cc (λ (c)
(define (escape) (c val ...))
form ...))]))
then you need to be able to look into the structure of what came from the reader, and that structure needs to be as rich as something made of lists and conses.
A toy reader: reeder
Reeder is a little Lisp reader written in Common Lisp that I wrote a little while ago for reasons I forget (but perhaps to help me learn CL-PPCRE, which it uses). It is emphatically a toy, but it is also small enough and simple enough to understand: certainly it is much smaller and simpler than the standard CL reader, and it demonstrates one approach to solving this problem. It is driven by a table known as a reedtable which defines how parsing proceeds.
So, for instance:
> (with-input-from-string (in "(defun foo (x) x)")
(reed :from in))
(defun foo (x) x)
Reeding
To read (reed) something using a reedtable:
look for the next interesting character, which is the next character not defined as whitespace in the table (reedtables have a configurable list of whitespace characters);
if that character is defined as a macro character in the table, call its function to read something;
otherwise call the table's token reader to read and interpret a token.
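The three steps above can be sketched as a table-driven dispatch loop. This is a Python approximation, not reeder's actual code: ReadTable, read_token, read_list and unbalanced are hypothetical names, the input is a string plus an index instead of a stream, tokens are returned as raw strings, and tokens are assumed to end at macro characters as well as at whitespace.

```python
def read_token(text, pos, table):
    """Default token reader: accumulate up to whitespace or a macro character."""
    end = pos
    while (end < len(text)
           and text[end] not in table.whitespace
           and text[end] not in table.macro_characters):
        end += 1
    return text[pos:end], end


class ReadTable:
    """Stand-in for a reedtable: whitespace, macro characters, token reader."""

    def __init__(self):
        self.whitespace = " \t\n"
        self.macro_characters = {}   # char -> function(text, pos, table)
        self.token_reader = read_token


def read(text, pos, table):
    """Read one object; return (object, position after it)."""
    # Step 1: look for the next interesting (non-whitespace) character.
    while pos < len(text) and text[pos] in table.whitespace:
        pos += 1
    if pos >= len(text):
        raise EOFError("nothing left to read")
    # Step 2: a macro character reads its object however it chooses.
    handler = table.macro_characters.get(text[pos])
    if handler is not None:
        return handler(text, pos, table)
    # Step 3: otherwise the table's token reader takes over.
    return table.token_reader(text, pos, table)


def read_list(text, pos, table):
    """Macro function for "(": read objects recursively until ")"."""
    items = []
    pos += 1  # consume "("
    while True:
        while pos < len(text) and text[pos] in table.whitespace:
            pos += 1
        if pos >= len(text):
            raise EOFError("unbalanced (")
        if text[pos] == ")":
            return items, pos + 1
        item, pos = read(text, pos, table)
        items.append(item)


def unbalanced(text, pos, table):
    """Macro function for a stray ")"."""
    raise SyntaxError("unbalanced )")


table = ReadTable()
table.macro_characters["("] = read_list
table.macro_characters[")"] = unbalanced
```

With this table, read("(defun foo (x) x)", 0, table) returns the nested list of token strings together with the position just past the closing parenthesis.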
Reeding tokens
The token reader lives in the reedtable and is responsible for accumulating and interpreting a token:
it accumulates a token in ways known to itself (but the default one does this by just trundling along the string handling single (\) and multiple (|) escapes defined in the reedtable until it gets to something that is whitespace in the table);
at this point it has a string and it asks the reedtable to turn this string into something, which it does by means of token parsers.
There is a small kludge in the second step: as the token reader accumulates a token it keeps track of whether it is 'denatured' which means that there were escaped characters in it. It hands this information to the token parsers, which allows them, for instance, to interpret |1|, which is denatured, differently to 1, which is not.
Token parsers are also defined in the reedtable: there is a define-token-parser form to define them. They have priorities, so that the highest priority one gets to be tried first and they get to say whether they should be tried for denatured tokens. Some token parser should always apply: it's an error if none do.
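A prioritised list of token parsers along these lines might look as follows in Python. This is a sketch under assumptions, not a port of define-token-parser: the TokenParsers class, its method names and the tuple used to represent symbols are all invented here, and ordinary string regexps replace CL-PPCRE's s-expression form.

```python
import re
from fractions import Fraction


class TokenParsers:
    """Token parsers tried from highest to lowest priority."""

    def __init__(self):
        self._parsers = []  # (priority, name, pattern, denatured_ok, build)

    def define(self, name, priority, pattern, build, denatured=False):
        """Install or replace a parser and keep the list sorted by priority."""
        self.remove(name)
        compiled = re.compile(pattern, re.DOTALL)
        self._parsers.append((priority, name, compiled, denatured, build))
        self._parsers.sort(key=lambda entry: entry[0], reverse=True)

    def remove(self, name):
        self._parsers = [p for p in self._parsers if p[1] != name]

    def parse(self, token, denatured=False):
        """Return the result of the first parser that applies to the token."""
        for _, _, pattern, denatured_ok, build in self._parsers:
            if denatured and not denatured_ok:
                continue  # this parser opted out of escaped tokens
            match = pattern.fullmatch(token)
            if match:
                return build(match)
        raise ValueError(f"no token parser applies to {token!r}")


parsers = TokenParsers()
parsers.define("integer", 2, r"[+-]?\d+", lambda m: int(m.group()))
parsers.define("rational", 1, r"([+-]?\d+)/(\d+)",
               lambda m: Fraction(int(m.group(1)), int(m.group(2))))
# Fallback: anything else, including denatured tokens, becomes a symbol.
parsers.define("symbol", 0, r".*", lambda m: ("symbol", m.group()),
               denatured=True)
```

Here parse("1") gives the integer 1, while parse("1", denatured=True) falls through to the symbol parser, mirroring the |1| versus 1 distinction described above.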
The default reedtable has token parsers which can parse integers and rational numbers, and a fallback one which parses a symbol. Here is an example of how you would replace this fallback parser so that instead of returning symbols it returns objects called 'cymbals' which might be the representation of symbols in some embedded language:
Firstly we want a copy of the reedtable, and we need to remove the symbol parser from that copy (having previously checked its name using reedtable-token-parser-names).
(defvar *cymbal-reedtable* (copy-reedtable nil))
(remove-token-parser 'symbol *cymbal-reedtable*)
Now here's an implementation of cymbals:
(defvar *namespace* (make-hash-table :test #'equal))
(defstruct cymbal
name)
(defgeneric ensure-cymbal (thing))
(defmethod ensure-cymbal ((thing string))
(or (gethash thing *namespace*)
(setf (gethash thing *namespace*)
(make-cymbal :name thing))))
(defmethod ensure-cymbal ((thing cymbal))
thing)
And finally here is the cymbal token parser:
(define-token-parser (cymbal 0 :denatured t :reedtable *cymbal-reedtable*)
((:sequence
:start-anchor
(:register (:greedy-repetition 0 nil :everything))
:end-anchor)
name)
(ensure-cymbal name))
An example of this. Before modifying the reedtable:
> (with-input-from-string (in "(x y . z)")
(reed :from in :reedtable *cymbal-reedtable*))
(x y . z)
After:
> (with-input-from-string (in "(x y . z)")
(reed :from in :reedtable *cymbal-reedtable*))
(#S(cymbal :name "x") #S(cymbal :name "y") . #S(cymbal :name "z"))
Macro characters
If something isn't the start of a token then it's a macro character. Macro characters have associated functions and these functions get called to read one object, however they choose to do that. The default reedtable has two-and-a-half macro characters:
" reads a string, using the reedtable's single & multiple escape characters;
( reads a list or a cons.
) is defined to raise an exception, as it can only occur if there are unbalanced parens.
The string reader is pretty straightforward (it has a lot in common with the token reader although it's not the same code).
The list/cons reader is mildly fiddly: most of the fiddliness is dealing with consing dots which it does by a slightly disgusting trick: it installs a secret token parser which will parse a consing dot as a special object if a dynamic variable is true, but otherwise will raise an exception. The cons reader then binds this variable appropriately to make sure that consing dots are parsed only where they are allowed. Obviously the list/cons reader invokes the whole reader recursively in many places.
And that's all the macro characters. So, for instance in the default setup, ' would read as a symbol (or a cymbal). But you can just install a macro character:
(defvar *qr-reedtable* (copy-reedtable nil))
(setf (reedtable-macro-character #\' *qr-reedtable*)
(lambda (from quote table)
(declare (ignore quote))
(values `(quote ,(reed :from from :reedtable table))
(inch from nil))))
And now 'x will read as (quote x) in *qr-reedtable*.
Similarly you could add a more complicated macro character on # to read objects depending on their next character in the way CL does.
An example of the quote reader. Before:
> (with-input-from-string (in "'(x y . z)")
(reed :from in :reedtable *qr-reedtable*))
\'
The object it has returned is a symbol whose name is "'", and it didn't read beyond that of course. After:
> (with-input-from-string (in "'(x y . z)")
(reed :from in :reedtable *qr-reedtable*))
`(x y . z)
Other notes
Everything works one-character-ahead, so all of the various functions get the stream being read, the first character they should be interested in and the reedtable, and return both their value and the next character. This avoids endlessly unreading characters (and probably tells you what grammar class it can handle natively; obviously macro character parsers can do whatever they like so long as things are sane when they return).
It probably doesn't use anything which isn't moderately implementable in non-Lisp languages. Some
Macros will cause pain in the usual way, but the only one is define-token-parser. I think the solution to that is the usual expand-the-macro-by-hand-and-write-that-code, but you could probably help a bit by having an install-or-replace-token-parser function which dealt with the bookkeeping of keeping the list sorted etc.
You'll need a language with dynamic variables to implement something like the cons reeder.
It uses CL-PPCRE's s-expression representation of regexps. I'm sure other languages have something like this (Perl does) because no-one wants to write stringy regexps: they must have died out decades ago.
It's a toy: it may be interesting to read but it's not suitable for any serious use. I found at least one bug while writing this: there will be many more.

How to invoke Erlang function with variable?

4> abs(1).
1
5> X = abs.
abs
6> X(1).
** exception error: bad function abs
7> erlang:X(1).
1
8>
Is there any particular reason why I have to use the module name when I invoke a function with a variable? This isn't going to work for me because, well, for one thing it is just way too much syntactic garbage and makes my eyes bleed. For another thing, I plan on invoking functions out of a list, something like (off the top of my head):
[X(1) || X <- [abs, f1, f2, f3...]].
Attempting to tack on various module names here is going to make the verbosity go through the roof, when the whole point of what I am doing is to reduce verbosity.
EDIT: Look here: http://www.erlangpatterns.org/chain.html The guy has made some pipe-forward function. He is invoking functions the same way I want to above, but his code doesn't work when I try to use it. But from what I know, the guy is an experienced Erlang programmer - I saw him give some keynote or whatever at a conference (well I saw it online).
Did this kind of thing used to work but not anymore? Surely there is a way I can do what I want - invoke these functions without all the verbosity and boilerplate.
EDIT: If I am reading the documentation right, it seems to imply that my example at the top should work (section 8.6) http://erlang.org/doc/reference_manual/expressions.html
I know abs is an atom, not a function. [...] Why does it work when the module name is used?
The documentation explains that (slightly reorganized):
ExprM:ExprF(Expr1,...,ExprN)
each of ExprM and ExprF must be an atom or an expression that
evaluates to an atom. The function is said to be called by using the
fully qualified function name.
ExprF(Expr1,...,ExprN)
ExprF
must be an atom or evaluate to a fun.
If ExprF is an atom the function is said to be called by using the implicitly qualified function name.
When using fully qualified function names, Erlang expects atoms or expressions that evaluate to atoms. In other words, you have to bind X to an atom: X = atom. That's exactly what you provide.
But in the second form, Erlang expects either an atom or an expression that evaluates to a function. Notice that last word. In other words, if you do not use fully qualified function name, you have to bind X to a function: X = fun module:function/arity.
In the expression X=abs, abs is not a function but an atom. If you want to bind a function instead, you can do it like this:
D = fun erlang:abs/1.
or so:
X = fun(X)->abs(X) end.
Try:
X = fun(Number) -> abs(Number) end.
Updated:
After looking at the discussion more, it seems like you're wanting to apply multiple functions to some input.
There are two projects that I haven't used personally, but I've starred on Github that may be what you're looking for.
Both of these projects use parse transforms:
fun_chain https://github.com/sasa1977/fun_chain
pipeline https://github.com/stolen/pipeline
Pipeline is unique because it uses a special syntax:
Result = [fun1, mod2:fun2, fun3] (Arg1, Arg2).
Of course, it could also be possible to write your own function to do this using a list of {module, function} tuples and applying the function to the previous output until you get the result.
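The idea of a list of {module, function} pairs applied one after another can be sketched as below. This is Python rather than Erlang, shown only to illustrate the shape of such a helper; resolve and pipeline are hypothetical names, and importlib plus getattr play the role that looking a function up by module and name plays in Erlang.

```python
from functools import reduce
from importlib import import_module


def resolve(module_name, function_name):
    """Look a function up from a (module, function) pair of names."""
    return getattr(import_module(module_name), function_name)


def pipeline(value, steps):
    """Feed value through each (module, function) step in order."""
    return reduce(lambda acc, step: resolve(*step)(acc), steps, value)


# abs(-16) -> math.sqrt(16) -> math.floor(4.0)
result = pipeline(-16, [("builtins", "abs"), ("math", "sqrt"), ("math", "floor")])
```

The names are only turned into callable functions at the moment each step runs, which is the same distinction the answers above draw between an atom and a fun.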

When to prefer `and` over `andalso` in guard tests

I am curious why the comma ‹,› is a shortcut for and and not andalso in guard tests.
Since I'd call myself a “C native” I fail to see any shortcomings of short-circuit boolean evaluation.
I compiled some test code using the to_core flag to see what code is actually generated. Using the comma, I see the left hand value and right hand value get evaluated and both and'ed. With andalso you have a case block within the case block and no call to erlang:and/2.
I did no benchmark tests but I daresay the andalso variant is the faster one.
To delve into the past:
Originally in guards there were only , separated tests which were evaluated from left-to-right until either there were no more and the guard succeeded or a test failed and the guard as a whole failed. Later ; was added to allow alternate guards in the same clause. If guards evaluate both sides of a , before testing then someone has gotten it wrong along the way. @Kay's example seems to imply that they do go from left-to-right as they should.
Boolean operators were only allowed much later in guards.
and, together with or, xor and not, is a boolean operator and was not intended for control. They are all strict and evaluate their arguments first, like the arithmetic operators +, -, * and '/'. There exist strict boolean operators in C as well.
The short-circuiting control operators andalso and orelse were added later to simplify some code. As you have said the compiler does expand them to nested case expressions so there is no performance gain in using them, just convenience and clarity of code. This would explain the resultant code you saw.
N.B. in guards there are tests and not expressions. There is a subtle difference which means that while using and and andalso is equivalent to , using orelse is not equivalent to ;. This is left to another question. Hint: it's all about failure.
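The hint about failure can be modelled outside Erlang. The sketch below is Python and only a model, based on the documented rule that an exception inside a guard makes that guard fail rather than crash; run_guard, semicolon and orelse are names invented for this example. With ; each alternative guard fails on its own, whereas with orelse the two operands belong to one guard, so an exception in the left operand takes the right one down with it.

```python
def run_guard(guard):
    """Evaluate one guard; an exception makes it fail instead of propagating."""
    try:
        return guard() is True
    except Exception:
        return False


def semicolon(*guards):
    """Model of `G1 ; G2`: succeeds if any guard succeeds on its own."""
    return any(run_guard(guard) for guard in guards)


def orelse(left, right):
    """Model of the single guard `left orelse right`."""
    return run_guard(lambda: left() or right())


def compare(x):
    """Return (result with ;, result with orelse) for 1/x > 1 versus x == 0."""
    first = lambda: 1 / x > 1
    second = lambda: x == 0
    return semicolon(first, second), orelse(first, second)
```

For x = 0 the division raises, so the ; form still succeeds through its second guard while the orelse form fails as a whole; for inputs where nothing raises, the two forms agree.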
So both and and andalso have their place.
Adam Lindberg's link is right. Using the comma does generate better beam code than using andalso. I compiled the following code using the +to_asm flag:
a(A,B) ->
case ok of
_ when A, B -> true;
_ -> false
end.
aa(A,B) ->
case ok of
_ when A andalso B -> true;
_ -> false
end.
which generates
{function, a, 2, 2}.
{label,1}.
{func_info,{atom,andAndAndalso},{atom,a},2}.
{label,2}.
{test,is_eq_exact,{f,3},[{x,0},{atom,true}]}.
{test,is_eq_exact,{f,3},[{x,1},{atom,true}]}.
{move,{atom,true},{x,0}}.
return.
{label,3}.
{move,{atom,false},{x,0}}.
return.
{function, aa, 2, 5}.
{label,4}.
{func_info,{atom,andAndAndalso},{atom,aa},2}.
{label,5}.
{test,is_atom,{f,7},[{x,0}]}.
{select_val,{x,0},{f,7},{list,[{atom,true},{f,6},{atom,false},{f,9}]}}.
{label,6}.
{move,{x,1},{x,2}}.
{jump,{f,8}}.
{label,7}.
{move,{x,0},{x,2}}.
{label,8}.
{test,is_eq_exact,{f,9},[{x,2},{atom,true}]}.
{move,{atom,true},{x,0}}.
return.
{label,9}.
{move,{atom,false},{x,0}}.
return.
I only looked into what is generated with the +to_core flag, but obviously there is an optimization step between to_core and to_asm.
It's an historical reason. and was implemented before andalso, which was introduced in Erlang 5.1 (the only reference I can find right now is EEP-17). Guards have not been changed because of backwards compatibility.
The boolean operators "and" and "or" always evaluate the arguments on both sides of the operator. If you want the behaviour of the C operators && and || (where the second argument is evaluated only if needed), use "andalso" and "orelse". For example, "true orelse false" is decided as soon as the first argument is found to be true, so the second argument is not evaluated, which would not be the case had "or" been used.
