Erlang vs Elixir Macros - erlang

I have came across some Erlang code which I am trying to convert to Elixir to help me learn both of the languages and understand the differences. Macros and metaprogramming in general is a topic I am still trying to get my head around, so hopefully you will understand my confusion.
The Erlang code
-define(p2(MAT, REP),
p2(W = MAT ++ STM) -> m_rep(0, W, STM, REP))
% where m_rep is a function already defined.
To me, it seems that in the above code, there is two separate definitions of the p2 macro that map to a private function called m_rep. In Elixir though, it seems that it is only possible to have one pattern matching definition. Is it possible to have different ones in Elixir too?

These are not two definitions. The first line is the macro, the second line is the replacement. The confusing bit is that the macro has the same name as the function for which it is generating clauses. For example when using your macro like this:
?p2("a", "b");
?p2("c", "d").
the above will be expanded to:
p2(w = "a" ++ stm) -> m_rep(0, w, stm, "b");
p2(w = "c" ++ stm) -> m_rep(0, w, stm, "d").
You can use erlc -P to produce a .P file that will show you the effects of macro expansion on your code. Check out this slightly simpler, compilable example:
-module(macro).
-export([foo/1]).
-define(foo(X),
foo(X) -> X).
?foo("bar");
?foo("baz");
?foo("qux").
Using erlc -P macro.erl you will get the following output to macro.P:
-file("macro.erl", 1).
-module(macro).
-export([foo/1]).
foo("bar") ->
"bar";
foo("baz") ->
"baz";
foo("qux") ->
"qux".
In Elixir you can define multiple function clauses using macros as well. It is more verbose, but I think it is also much clearer. The Elixir equivalent would be:
defmodule MyMacros do
defmacro p2(mat, rep) do
quote do
def p2(w = unquote(mat) ++ stm) do
m_rep(0, w, stm, unquote(rep))
end
end
end
end
which you can use to define multiple function clauses, just like the erlang counterpart:
defmodule MyModule do
require MyMacros
MyMacros.p2('a', 'b')
MyMacros.p2('c', 'd')
end

I can't help myself here. :-) If it's the macros you are after then using LFE (Lisp Flavoured Erlang) gives you much better macro handling than either erlang or elixir. It also is compatible with both.

-define(p2(MAT, REP),
p2(w = MAT ++ stm) -> m_rep(0, w, stm, REP))
% where m_rep is a function already defined.
The code above has a number of issues.
There's no such thing as a macro with multiple clauses in Erlang. The above code doesn't define two separate definitions of the p2 macro that map to a private function called m_rep. What it does is it defines a 2-argument macro, which defines a p2 function taking some parameters and calling m_rep. However, the parameter definition of the internal p2 function is incorrect:
it tries to use ++ with the second argument not being a list
it tries to assign a value to an atom (did you mean a capital W, a variable, instead of a small w, an atom?)
it tries the assignment in a place where an assignment is not allowed - in a function head.
Did you try to test for equality (== instead of =), not to do an assignment? If so, you have to use a guard.
Moreover, it seems to me you're trying to use w and stm as though they were variables and pass them to m_rep, but they're not! Variables in Erlang have to start with a capital letter. Variables in Elixir, on the other hand, do not. It might be you're confusing concepts from the two similar but still different languages.
My general advice would be to pick one language and learn it well and only later with that knowledge under your belt try a different language. Pick Erlang if you're completely new to programming - it's simpler, there are less things to learn upfront. Pick Elixir if you already know Ruby or are more into immediate marketability of your skills.
Please say more about your intention and I might be able to come up with code expressing it. The above snippet is too ambiguous.

Related

Implementing Macro in a Rascal language project

Any idea on how to implement macro syntax with Rascal and also how to implement the typing and expansion(translation) of the macro syntax in Rascal? Any link to projects or repositories on this problem would also be appreciated.
Macro's are definitions of code substitutions in syntax trees, which is definitely one of the main features of Rascal. Questions I would have before advicing specific techniques:
adding macro's to an existing languages, or to a new language?
macro's at refactoring time, at compile-time or a run-time?
which would inform the question whether or not to implement macro's on concrete syntaxt trees or abstract syntax trees.
I would not say macros are a "problem" per se. The raw substitutions in syntax trees are trivial with Rascal. However, "hygienic macros" are more involved. Here we have to consider the capturing of variables by the expanded macro bodies, and what we can do about this (renaming) to avoid it. The literature on how to make macros hygienic is plenty. The complexity of hygienic macros depends on the type and name analysis (scoping) system of the base language that macros are added to.
If you have a DSL that you want to translate in stages to the target code, that can also be called "macros", but you will not find that name in the documentation. Here is an example: https://github.com/usethesource/flybytes/blob/main/src/lang/flybytes/macros/ControlFlow.rsc where "macro" is used to rewrite an additional AST node to its semantics in the "core" language.
The basic mechanisms are:
pattern matching: detects what you want to expand, with macros this is often a single ADT constructor but it can also be a more complex special case like matching i+=1 to substitute it with i++ .
substitution: at the location where the match was found, we create a new AST value in a simpler language but with the same semantics. This is done with AST expressions in Rascal, the => operator in visit and insert statements, and return and = in functions.
traversal: guiding the pattern matching and substitution without having to write to much boilerplate recursive functions.
Small example:
data Bool(loc src=|unknown:///|)
= \and(Bool l, Bool r)
| \or(Bool r, Bool r)
| \true()
| \false()
| \not(Bool a)
;
I extend the language with a "macro":
data Bool = impl(Bool l, Bool r)
A first option is to rewrite the constructor immediately and always with an overloaded function:
Bool impl(Bool l, Bool r) = or(not(l), r);
However, we lose some information here for debugging purposes, so let's try to keep the information intact:
Bool impl(Bool l, Bool r, src=loc s) = or(not(l), r, src=s);
Sometimes we want to delay the expansion for a specific stage in the compiler. In particular with the above "rewrite rule" a type-checker will not see the different anymore between ==> and || which sometimes creates usability issues with error messages.
In that case we wrap the expansion in a visit and stage it as a function:
Bool macroExpansion(Bool input) = visit(input) {
case impl(Bool l, Bool r, src=loc s) => or(not(l), r, src=s)
// add more rules here
}
It is also possible to encapsulate rewrite rules as reusable functions:
Bool expand1(impl(Bool l, Bool r, src=loc s) = or(not(l), r, src=s);
Bool expand2(not(not(Bool b))) = b;
and then pass those or apply those: (expand1 + expand2)(myBool)
So to wrap this up:
pattern matching is the key to macro expansion, patterns can be wrapped in functions or visit cases or both, and functions can be passed around and combined.
watch out to do some "origin tracking" and forward src fields to the right-hand sides of rewrite rules, otherwise the generated code does not know where it comes from.

What is the point of op_Quotation if it cannot be used?

According the F# specification for operator overloading
<# #> op_Quotation
<## ##> op_QuotationUntyped
is given as with many other operators. Unless I'm missing something I don't believe that I can use this for custom types, so why is it listed?
I think you are right that there is no way of actually using those as custom operators. I suspect those are treated as operators in case this was useful, at some point in the future of the language, for some clever new feature.
The documentation really merely explains how the names of the operators get encoded. For non-special operator names, F# encodes those in a systematic way. For the ones listed in the page, it has a special nicer name. Consider this type:
type X() =
static member (<^><>) (a:int,b:int) = a + b
static member (<# #>) (a:int,b:int) = a + b
If you look at the names of those members:
[ for m in typeof<X>.GetMembers() -> m.Name ]
You see that the first operator got compiled as op_LessHatGreaterLessGreater, while the second one as op_Quotation. So this is where the name memntioned in the table comes in - it is probably good this is documented somewhere, but I think you're right, that this is not particularly useful!

How to invoke Erlang function with variable?

4> abs(1).
1
5> X = abs.
abs
6> X(1).
** exception error: bad function abs
7> erlang:X(1).
1
8>
Is there any particular reason why I have to use the module name when I invoke a function with a variable? This isn't going to work for me because, well, for one thing it is just way too much syntactic garbage and makes my eyes bleed. For another thing, I plan on invoking functions out of a list, something like (off the top of my head):
[X(1) || X <- [abs, f1, f2, f3...]].
Attempting to tack on various module names here is going to make the verbosity go through the roof, when the whole point of what I am doing is to reduce verbosity.
EDIT: Look here: http://www.erlangpatterns.org/chain.html The guy has made some pipe-forward function. He is invoking functions the same way I want to above, but his code doesn't work when I try to use it. But from what I know, the guy is an experienced Erlang programmer - I saw him give some keynote or whatever at a conference (well I saw it online).
Did this kind of thing used to work but not anymore? Surely there is a way I can do what I want - invoke these functions without all the verbosity and boilerplate.
EDIT: If I am reading the documentation right, it seems to imply that my example at the top should work (section 8.6) http://erlang.org/doc/reference_manual/expressions.html
I know abs is an atom, not a function. [...] Why does it work when the module name is used?
The documentation explains that (slightly reorganized):
ExprM:ExprF(Expr1,...,ExprN)
each of ExprM and ExprF must be an atom or an expression that
evaluates to an atom. The function is said to be called by using the
fully qualified function name.
ExprF(Expr1,...,ExprN)
ExprF
must be an atom or evaluate to a fun.
If ExprF is an atom the function is said to be called by using the implicitly qualified function name.
When using fully qualified function names, Erlang expects atoms or expression that evaluates to atoms. In other words, you have to bind X to an atom: X = atom. That's exactly what you provide.
But in the second form, Erlang expects either an atom or an expression that evaluates to a function. Notice that last word. In other words, if you do not use fully qualified function name, you have to bind X to a function: X = fun module:function/arity.
In the expression X=abs, abs is not a function but an atom. If you want thus to define a function,you can do so:
D = fun erlang:abs/1.
or so:
X = fun(X)->abs(X) end.
Try:
X = fun(Number) -> abs(Number) end.
Updated:
After looking at the discussion more, it seems like you're wanting to apply multiple functions to some input.
There are two projects that I haven't used personally, but I've starred on Github that may be what you're looking for.
Both of these projects use parse transforms:
fun_chain https://github.com/sasa1977/fun_chain
pipeline https://github.com/stolen/pipeline
Pipeline is unique because it uses a special syntax:
Result = [fun1, mod2:fun2, fun3] (Arg1, Arg2).
Of course, it could also be possible to write your own function to do this using a list of {module, function} tuples and applying the function to the previous output until you get the result.

Lua Semicolon Conventions

I was wondering if there is a general convention for the usage of semicolons in Lua, and if so, where/why should I use them? I come from a programming background, so ending statements with a semicolon seems intuitively correct. However I was concerned as to why they are "optional" when its generally accepted that semicolons end statements in other programming languages. Perhaps there is some benefit?
For example: From the lua programming guide, these are all acceptable, equivalent, and syntactically accurate:
a = 1
b = a*2
a = 1;
b = a*2;
a = 1 ; b = a*2
a = 1 b = a*2 -- ugly, but valid
The author also mentions: Usually, I use semicolons only to separate two or more statements written in the same line, but this is just a convention.
Is this generally accepted by the Lua community, or is there another way that is preferred by most? Or is it as simple as my personal preference?
Semi-colons in Lua are generally only required when writing multiple statements on a line.
So for example:
local a,b=1,2; print(a+b)
Alternatively written as:
local a,b=1,2
print(a+b)
Off the top of my head, I can't remember any other time in Lua where I had to use a semi-colon.
Edit: looking in the lua 5.2 reference I see one other common place where you'd need to use semi-colons to avoid ambiguity - where you have a simple statement followed by a function call or parens to group a compound statement. here is the manual example located here:
--[[ Function calls and assignments can start with an open parenthesis. This
possibility leads to an ambiguity in the Lua grammar. Consider the
following fragment: ]]
a = b + c
(print or io.write)('done')
-- The grammar could see it in two ways:
a = b + c(print or io.write)('done')
a = b + c; (print or io.write)('done')
in local variable and function definition. Here I compare two quite similar sample codes to illustrate my point of view.
local f; f = function() function-body end
local f = function() function-body end
These two functions can return different results when the function-body section contains reference to variable "f".
Many programming languages (including Lua) that do not require semicolons have a convention to not use them, except for separating multiple statements on the same line.
Javascript is an important exception, which generally uses semicolons by convention.
Kotlin is also technically an exception. The Kotlin Documentation say not only not to use semicolons on non-batched statements, but also to
Omit semicolons whenever possible.
In local variable definitions, we get ambiguous results from time to time:
local a, b = string.find("hello world", "hello") --> a = nil, b = nil
while sometimes a and b are assigned the right values 7 and 11.
So I found no choice but to follow one of these two approaches:
local a, b; a, b = string.find("hello world", "hello") --> a, b = 7, 11
local a, b
a, b = string.find("hello world", "hello") --> a, b = 7, 11
For having more than one thing on a line, for example:
c=5
a=1+c
print(a) -- 6
could be shortened to:
c=5; a=1+c; print(a) -- 6
also worth noting that if you're used to Javascript, or something like that, where you have to end a line in a semicolon, and you're especially used to writing that, then this means that you won't have to remove that semicolon, and trust me, i'm used to Javascript too, and I really, really forget that you don't need the semicolon, every time I write a new line!

REBOL path operator vs division ambiguity

I've started looking into REBOL, just for fun, and as a fan of programming languages, I really like seeing new ideas and even just alternative syntaxes. REBOL is definitely full of these. One thing I noticed is the use of '/' as the path operator which can be used similarly to the '.' operator in most object-oriented programming languages. I have not programmed in REBOL extensively, just looked at some examples and read some documentation, but it isn't clear to me why there's no ambiguity with the '/' operator.
x: 4
y: 2
result: x/y
In my example, this should be division, but it seems like it could just as easily be the path operator if x were an object or function refinement. How does REBOL handle the ambiguity? Is it just a matter of an overloaded operator and the type system so it doesn't know until runtime? Or is it something I'm missing in the grammar and there really is a difference?
UPDATE Found a good piece of example code:
sp: to-integer (100 * 2 * length? buf) / d/3 / 1024 / 1024
It appears that arithmetic division requires whitespace, while the path operator requires no whitespace. Is that it?
This question deserves an answer from the syntactic point of view. In Rebol, there is no "path operator", in fact. The x/y is a syntactic element called path. As opposed to that the standalone / (delimited by spaces) is not a path, it is a word (which is usually interpreted as the division operator). In Rebol you can examine syntactic elements like this:
length? code: [x/y x / y] ; == 4
type? first code ; == path!
type? second code
, etc.
The code guide says:
White-space is used in general for delimiting (for separating symbols).
This is especially important because words may contain characters such as + and -.
http://www.rebol.com/r3/docs/guide/code-syntax.html
One acquired skill of being a REBOler is to get the hang of inserting whitespace in expressions where other languages usually do not require it :)
Spaces are generally needed in Rebol, but there are exceptions here and there for "special" characters, such as those delimiting series. For instance:
[a b c] is the same as [ a b c ]
(a b c) is the same as ( a b c )
[a b c]def is the same as [a b c] def
Some fairly powerful tools for doing introspection of syntactic elements are type?, quote, and probe. The quote operator prevents the interpreter from giving behavior to things. So if you tried something like:
>> data: [x [y 10]]
>> type? data/x/y
>> probe data/x/y
The "live" nature of the code would dig through the path and give you an integer! of value 10. But if you use quote:
>> data: [x [y 10]]
>> type? quote data/x/y
>> probe quote data/x/y
Then you wind up with a path! whose value is simply data/x/y, it never gets evaluated.
In the internal representation, a PATH! is quite similar to a BLOCK! or a PAREN!. It just has this special distinctive lexical type, which allows it to be treated differently. Although you've noticed that it can behave like a "dot" by picking members out of an object or series, that is only how it is used by the DO dialect. You could invent your own ideas, let's say you make the "russell" command:
russell [
x: 10
y: 20
z: 30
x/y/z
(
print x
print y
print z
)
]
Imagine that in my fanciful example, this outputs 30, 10, 20...because what the russell function does is evaluate its block in such a way that a path is treated as an instruction to shift values. So x/y/z means x=>y, y=>z, and z=>x. Then any code in parentheses is run in the DO dialect. Assignments are treated normally.
When you want to make up a fun new riff on how to express yourself, Rebol takes care of a lot of the grunt work. So for example the parentheses are guaranteed to have matched up to get a paren!. You don't have to go looking for all that yourself, you just build your dialect up from the building blocks of all those different types...and hook into existing behaviors (such as the DO dialect for basics like math and general computation, and the mind-bending PARSE dialect for some rather amazing pattern matching muscle).
But speaking of "all those different types", there's yet another weirdo situation for slash that can create another type:
>> type? quote /foo
This is called a refinement!, and happens when you start a lexical element with a slash. You'll see it used in the DO dialect to call out optional parameter sets to a function. But once again, it's just another symbolic LEGO in the parts box. You can ascribe meaning to it in your own dialects that is completely different...
While I didn't find any written definitive clarification, I did also find that +,-,* and others are valid characters in a word, so clearly it requires a space.
x*y
Is a valid identifier
x * y
Performs multiplication. It looks like the path operator is just another case of this.

Resources