pattern match in formal parameter of function definition - erlang

Here's something I've seen in erlang code a few times, but it's a tough thing to google and I can only find this example (the first code block in link below):
http://www.process-one.net/en/wiki/ejabberd_HTTP_request_handlers/
In the "head" of function definition of process/2
process(_LocalPath = ["world"], _Request) ->
there is a pattern match on first parameter / argument;
Does this act similarly like a guard, so the following clause will be executed only if the first argument passed to process/2 is string "world", or is "world" some kind of a default argument? Or i completely misunderstood/ mis-guessed?

Yes, this is a pattern match. The clause will be executed if the first argument is a list with a single element, the element being the string "world".

You are correct: _LocalPath = ["world"] acts as a pattern "guard". If the first parameter to the function "process" isn't equal to ["world"], then the emulator proceeds to find a match down.
One thing to note: _LocalPath serves as "decorator" to enhance readability since the identifier starts with an underscore.

The = in a pattern is used for an alias, it basically allows you to have your cake and eat it. It both does a normal pattern match and binds a variable to the whole matched data. It is practical if you need the whole data as it saves you having to reconstruct it. You can use it anywhere in a pattern. It has nothing to do with guards.
Starting a variable with a _ as in _LocalPath is too tell the compiler not to complain if this variable is not used. Normally the compiler whines a bit if you bind variables and don't use them. Apart from this there is nothing special about variables whose names start with _, you can use them as you would any variable.
The only really special variable is _, the anonymous variable. It always matches and is never bound so you can use it as an anonymous place holder. Which is why it exists in the first place.
I personally very rarely use variables starting with _ and prefer to use just _. I also feel that cluttering up patterns with unnecessary things is a Bad Thing so I wouldn't use aliases for documentation like that. I would write:
%% process(LocalPath, Request) -> ... .
process(["world"], _) ->
or perhaps a type declaration if you prefer. Keeps the code shorter and more legible, I think.

Related

Lua Terminology related to OOP

To be to the point; I've done Lua for awhile, but never quite got the terminology down to specifics, so I've been Googling for hours and haven't come up with a definitive answer.
Related to OOP in Lua, the terminology used include:
Object
Class
Function
Method
Table
The question is, when are these properly used? Such as in the example below:
addon = { }
function addon:test_func( )
return 'hi'
end
Q: From my understanding with Lua and OOP, addon is a table, however, I've read that it can be an object as well -- but when it is technically an object? After a function is created within that table?
Q: test_func is a function, however, I've read that it becomes a "Method" when it's placed within a table (class).
Q: The entire line addon:test_func( ), I know the colon is an operator, but what is the term for the entire line set of text? A class itself?
Finally, for this example code:
function addon:test_func( id, name )
end
Q: What is id and name, because I've seen some people identify them as arguments, but then other areas classify them as parameters, so I've stuck with parameters.
So in short, what is the proper terminology for each of these, and when do they become what they are?
Thanks
From my understanding with Lua and OOP, addon is a table, however, I've read that it can be an object as well -- but when it is technically an object? After a function is created within that table?
Object is not a well-defined term. I've seen it defined (in C) as any value whatsoever. In Lua, I would consider it synonymous with a table. You could also define it as an instance of a class.
test_func is a function, however, I've read that it becomes a "Method" when it's placed within a table (class).
You're basically right. A method is any function that is intended to be called with the colon notation. Metamethods are also methods, because, like regular methods, they define the behavior of tables.
The entire line addon:test_func( ), I know the colon is an operator, but what is the term for the entire line set of text? A class itself?
There's no name for that particular piece of code. It's just part of a method definition.
Also, I wouldn't call the colon an operator. An operator would be the plus in x + y where x and y both mean something by themselves. In addon:test_func(), test_func only has meaning inside the table addon, and it's only valid to use the colon when calling or defining methods. The colon is actually a form of syntactic sugar where the real operator is the indexing operator: []. Assuming that you're calling the method, the expansion would be: addon['test_func'](addon).
What is id and name, because I've seen some people identify them as arguments, but then other areas classify them as parameters, so I've stuck with parameters.
They're parameters. Parameters are the names that you declare in the function signature. Arguments are the values that you pass to a function.

When are F# function calls evaluated; lazily or immediately?

Curried functions in F#. I get the bit where passing in a subset of parameters yields a function with presets. I just wondered if passing all of the parameters is any different. For example:
let addTwo x y = x + y
let incr a = addTwo 1
let added = addTwo 2 2
incr is a function taking one argument.
Is added an int or a function? I can imagine an implementation where "added" is evaluated lazily only on use (like Schroedinger's Cat on opening the box). Is there any guarantee of when the addition is performed?
added is not a function; it is just a value that is calculated and bound to the name on the spot. A function always needs at least one parameter; if there is nothing useful to pass, that would be the unit value ():
let added () = addTwo 2 2
F# is an eagerly evaluated language, so an expression like addTwo 2 2 will immediately be evaluated to a value of the int type.
Haskell, by contrast, is lazily evaluated. An expression like addTwo 2 2 will not be evaluated until the value is needed. The type of the expression would still be a single integer, though. Even so, such an expression is, despite its laziness, not regarded as a function; in Haskell, such an unevaluated expression is called a thunk. That basically just means 'an arbitrarily complex expression that's not yet evaluated'.
incr is a function taking one argument. Is added an int or a function?
added, in this case, is a named binding that evaluates to an int. It is not a function.
I can imagine an implementation where "added" is evaluated lazily only on use (like Schroedinger's Cat on opening the box). Is there any guarantee of when the addition is performed?
The addition will be performed immediately when the binding is generated. There is no laziness involved.
As explained by TeaDrivenDev, you can change added to be a bound function instead of a bound value by adding a parameter, which can be unit:
let added () = addTwo 2 2
In this case, it will be a function, so the addition wouldn't happen until you call it:
let result = added () // Call the function, bind output to result
No. But kind of yes. But really, no.
You can construct a pure functional language that only has functions and nothing else. Lambda calculus is a complete algebra, so the theory is there. In this model, added can be considered a parameter-less function (in contrast to e.g. random(), where there's one parameter of type unit).
But F# is different. Since it's a rather pragmatic mix of imperative and functional programming, the result is not a function[1]. Instead, it's a value, just like a local in C#. This is no implementation detail - it's actually part of the F# specification. This does have disadvantages - it means its possible to have an ambiguous definition, where a definition could be either a value or a function definition (14.6.1).
[1] - Though in a pure functional program, you can't tell the difference - it's the same as just doing a substitution of the function with a cached value, which is perfectly legal.

What are the steps in doing incrementation in erlang?

increment([]) -> [];
increment([H|T]) -> [H+1|increment(T)].
decrement([]) -> [];
decrement([H|T]) -> [H-1|decrement(T)].
So I have this code but I don't know how they properly work like in java.
Java and Erlang are different beasts. I don't recommend trying to make comparisons to Java when learning Erlang, especially if Java is the only language you know so far. The code you've posted is a good example of the paradigm known as "functional programming". I'd suggest doing some reading on that subject to help you understand what's going on. To try to break this down as far as Erlang goes, you need to understand that an Erlang function is completely different from a Java method.
In Java, your method signature is composed of the method name and the types of its arguments. The return type can also be significant. A Java increment method like the function you wrote might be written like List<Integer> increment(List<Integer> input). The body of the Java method would probably iterate through the list an element at a time and set each element to itself plus one:
List<Integer> increment(List<Integer> input) {
for (int i = 0; i < input.size; i++) {
input.set(i, input.get(i) + 1);
}
}
Erlang has almost nothing in common with this. To begin with, an erlang function's "signature" is the name and arity of the function. Arity means how many arguments the function accepts. So your increment function is known as increment/1, and that's its unique signature. The way you write the argument list inside the parentheses after the function name has less to do with argument types than with the pattern of the data passed to it. A function like increment([]) -> ... can only successfully be called by passing it [], the empty list. Likewise, the function increment([Item]) -> ... can only be successfully called by passing it a list with one item in it, and increment([Item1, Item2]) -> ... must be passed a list with two items in it. This concept of matching data to patterns is quite aptly known as "pattern matching", and you'll find it in many functional languages. In Erlang functions, it's used to select which head of the function to execute. This bears a rough similarity to Java's method overloading, where you can have many methods with the same name but different argument types; however a pattern in an Erlang function head can bind variables to different pieces of the arguments that match the pattern.
In your code example, the function increment/1 has two heads. The first head is executed only if you pass an empty list to the function. The second head is executed only if you pass a non-empty list to the function. When that happens, two variables, H and T, are bound. H is bound to the first item of the list, and T is bound to the rest of the list, meaning all but the first item. That's because the pattern [H|T] matches a non-empty list, including a list with one element, in which case T would be bound to the empty list. The variables thus bound can be used in the body of the function.
The bodies of your functions are a very typical form of iterating a list in Erlang to produce a new list. It's typical because of another important difference from Java, which is that Erlang data is immutable. That means there's no such concept as "setting an element of a list" like I did in the Java code above. If you want to change a list, you have to build a new one, which is what your code does. It effectively says:
The result of incrementing the empty list is the empty list.
The result of incrementing a non-empty list is:
Take the first element of the list: H.
Increment the rest of the list: increment(T).
Prepend H+1 to the result of incrementing the rest of the list.
Note that you want to be careful about how you build lists in Erlang, or you can end up wasting a lot of resources. The List Handling User's Guide is a good place to learn about that. Also note that this code uses a concept known as "recursion", meaning that the function calls itself. In many popular languages, including Java, recursion is of limited usefulness because each new function call adds a stack frame, and your available memory space for stack frames is relatively limited. Erlang and many functional languages support a thing known as "tail call elimination", which is a feature that allows properly written code to recurse indefinitely without exhausting any resources.
Hopefully this helps explain things. If you can ask a more specific question, you might get a better answer.

DLV Rule is not safe

I am starting to work with DLV (Disjunctive Datalog) and I have a rule that is reporting a "Rule is not safe" error, when running the code. The rule is the following:
foo(R, 1) :- not foo(R, _)
I have read the manual and seen that "cyclic dependencies are disallowed". I guess this is why I am being reported the error, but I am not sure how this statement is so problematic to DLV. The final goal is to have some kind of initialization in case that the predicate has not been defined.
More precisely, if there is no occurrence of 'foo' with the parameter R (and anything else), then define it with parameters R and 1. Once it is defined, the rule shouldn't be triggered again. So, this is not a real recursion in my opinion.
Any comments on how to solve this issue are welcomed!
I have realised that I probably need another predicate to match the parameter R in the body of the rule. Something like this:
foo(R, 1) :- not foo(R, _), bar(R)
Since, otherwise there would be no way to know whether there are no occurrences of foo(R, _). I don't know whether I made myself clear.
Anyway, this doesn't work either :(
To the particular "Rule is not safe" error: First of all this has nothing to do with cyclic or acyclic dependencies. The same error message shows up for the non-cyclic program:
foo2(R, 1) :- not foo(R,_), bar(R).
The problem is that the program is actually not safe (http://www.dlvsystem.com/html/DLV_User_Manual.html#SAFETY). As mentioned in the section on negative rules (anchor #AEN375, I am only allowed to use 2 links in my answer):
Variables, which occur in a negated literal, must also occur in a
positive literal in the body.
Observe that the _ is an anonymous variable. I.e., the program
foo(R,1) :- not foo(R,_), bar(R).
can be equivalently written as (and is equivalent to)
foo(R,1) :- not foo(R,X), bar(R).
Anonymous variables (DLV manual, anchor #AEN264 - at the end of the section) just allow us to avoid inventing names for variables that will only occur once within the rule (i.e. for variables that only express "there is some value, I absolutely do not care about it), but they are variables nevertheless. And since negation with not is "negation" and not "true negation" (or "strong negation" as it is also often called), none of the three safety conditions is satisfied by the rule.
A very rough and high-level intuition for safety is that it guarantees that every variable in the program can be assigned to some finite domain - as it is now the case with R by adding bar(R). However, the same also must be the case for the anonymous variable _ .
To the actual problem of defining default values:
As pointed out by lambda.xy.x, the problem here is the Answer Set (or stable model) semantics of DLV: Trying to do it in one rule does not give any solution:
In order to get a safe program, we could replace the above problems e.g. by
foo(1,2). bar(1). bar(2).
tmp(R) :- foo(R,_).
foo(R,1) :- not tmp(R), bar(R).
This has no stable model:
Assume the answer is, as intended,
{foo(1,2), bar(1), bar(2), foo(2,1)}
However, this is not a valid model, since tmp(R) :- foo(R,_) would require it to contain tmp(2). But then, "not tmp(2)" is no longer true, and therefore having foo(2,1) in the model violates the required minimality of the model. (This is not exactly what is going on, more a rough intuition. More technical details could be found in any article on answer set programming, a quick Google search gave me this paper as one of the first results: http://www.kr.tuwien.ac.at/staff/tkren/pub/2009/rw2009-asp.pdf)
In order to solve the problem, it is therefore somehow necessary to "break the cycle". One possibility would be:
foo(1,2). bar(1). bar(2). bar(3).
tmp(R) :- foo(R,X), X!=1.
foo(R,1) :- bar(R), not tmp(R).
I.e., by explicitly stating that we want to add R into the intermediate atom only if the value is different from 1, having foo(2,1) in the model does not contradict tmp(2) not being part of the model as well. Of course, this no longer allows to distinguish whether foo(R,1) is there as default value or by input, but if this is not required ...
Another possibility would be to not use foo for the computation, but some foo1 instead. I.e. having
foo1(R,X) :- foo(R,X).
tmp(R) :- foo(R,_).
foo1(R,1) :- bar(R), not tmp(R).
and then just use foo1 instead of foo.

Why are there two kinds of functions in Elixir?

I'm learning Elixir and wonder why it has two types of function definitions:
functions defined in a module with def, called using myfunction(param1, param2)
anonymous functions defined with fn, called using myfn.(param1, param2)
Only the second kind of function seems to be a first-class object and can be passed as a parameter to other functions. A function defined in a module needs to be wrapped in a fn. There's some syntactic sugar which looks like otherfunction(&myfunction(&1, &2)) in order to make that easy, but why is it necessary in the first place? Why can't we just do otherfunction(myfunction))? Is it only to allow calling module functions without parenthesis like in Ruby? It seems to have inherited this characteristic from Erlang which also has module functions and funs, so does it actually comes from how the Erlang VM works internally?
It there any benefit having two types of functions and converting from one type to another in order to pass them to other functions? Is there a benefit having two different notations to call functions?
Just to clarify the naming, they are both functions. One is a named function and the other is an anonymous one. But you are right, they work somewhat differently and I am going to illustrate why they work like that.
Let's start with the second, fn. fn is a closure, similar to a lambda in Ruby. We can create it as follows:
x = 1
fun = fn y -> x + y end
fun.(2) #=> 3
A function can have multiple clauses too:
x = 1
fun = fn
y when y < 0 -> x - y
y -> x + y
end
fun.(2) #=> 3
fun.(-2) #=> 3
Now, let's try something different. Let's try to define different clauses expecting a different number of arguments:
fn
x, y -> x + y
x -> x
end
** (SyntaxError) cannot mix clauses with different arities in function definition
Oh no! We get an error! We cannot mix clauses that expect a different number of arguments. A function always has a fixed arity.
Now, let's talk about the named functions:
def hello(x, y) do
x + y
end
As expected, they have a name and they can also receive some arguments. However, they are not closures:
x = 1
def hello(y) do
x + y
end
This code will fail to compile because every time you see a def, you get an empty variable scope. That is an important difference between them. I particularly like the fact that each named function starts with a clean slate and you don't get the variables of different scopes all mixed up together. You have a clear boundary.
We could retrieve the named hello function above as an anonymous function. You mentioned it yourself:
other_function(&hello(&1))
And then you asked, why I cannot simply pass it as hello as in other languages? That's because functions in Elixir are identified by name and arity. So a function that expects two arguments is a different function than one that expects three, even if they had the same name. So if we simply passed hello, we would have no idea which hello you actually meant. The one with two, three or four arguments? This is exactly the same reason why we can't create an anonymous function with clauses with different arities.
Since Elixir v0.10.1, we have a syntax to capture named functions:
&hello/1
That will capture the local named function hello with arity 1. Throughout the language and its documentation, it is very common to identify functions in this hello/1 syntax.
This is also why Elixir uses a dot for calling anonymous functions. Since you can't simply pass hello around as a function, instead you need to explicitly capture it, there is a natural distinction between named and anonymous functions and a distinct syntax for calling each makes everything a bit more explicit (Lispers would be familiar with this due to the Lisp 1 vs. Lisp 2 discussion).
Overall, those are the reasons why we have two functions and why they behave differently.
I don't know how useful this will be to anyone else, but the way I finally wrapped my head around the concept was to realize that elixir functions aren't Functions.
Everything in elixir is an expression. So
MyModule.my_function(foo)
is not a function but the expression returned by executing the code in my_function. There is actually only one way to get a "Function" that you can pass around as an argument and that is to use the anonymous function notation.
It is tempting to refer to the fn or & notation as a function pointer, but it is actually much more. It's a closure of the surrounding environment.
If you ask yourself:
Do I need an execution environment or a data value in this spot?
And if you need execution use fn, then most of the difficulties become much
clearer.
I may be wrong since nobody mentioned it, but I was also under the impression that the reason for this is also the ruby heritage of being able to call functions without brackets.
Arity is obviously involved but lets put it aside for a while and use functions without arguments. In a language like javascript where brackets are mandatory, it is easy to make the difference between passing a function as an argument and calling the function. You call it only when you use the brackets.
my_function // argument
(function() {}) // argument
my_function() // function is called
(function() {})() // function is called
As you can see, naming it or not does not make a big difference. But elixir and ruby allow you to call functions without the brackets. This is a design choice which I personally like but it has this side effect you cannot use just the name without the brackets because it could mean you want to call the function. This is what the & is for. If you leave arity appart for a second, prepending your function name with & means that you explicitly want to use this function as an argument, not what this function returns.
Now the anonymous function is bit different in that it is mainly used as an argument. Again this is a design choice but the rational behind it is that it is mainly used by iterators kind of functions which take functions as arguments. So obviously you don't need to use & because they are already considered arguments by default. It is their purpose.
Now the last problem is that sometimes you have to call them in your code, because they are not always used with an iterator kind of function, or you might be coding an iterator yourself. For the little story, since ruby is object oriented, the main way to do it was to use the call method on the object. That way, you could keep the non-mandatory brackets behaviour consistent.
my_lambda.call
my_lambda.call()
my_lambda_with_arguments.call :h2g2, 42
my_lambda_with_arguments.call(:h2g2, 42)
Now somebody came up with a shortcut which basically looks like a method with no name.
my_lambda.()
my_lambda_with_arguments.(:h2g2, 42)
Again, this is a design choice. Now elixir is not object oriented and therefore call not use the first form for sure. I can't speak for José but it looks like the second form was used in elixir because it still looks like a function call with an extra character. It's close enough to a function call.
I did not think about all the pros and cons, but it looks like in both languages you could get away with just the brackets as long as you make brackets mandatory for anonymous functions. It seems like it is:
Mandatory brackets VS Slightly different notation
In both cases you make an exception because you make both behave differently. Since there is a difference, you might as well make it obvious and go for the different notation. The mandatory brackets would look natural in most cases but very confusing when things don't go as planned.
Here you go. Now this might not be the best explanation in the world because I simplified most of the details. Also most of it are design choices and I tried to give a reason for them without judging them. I love elixir, I love ruby, I like the function calls without brackets, but like you, I find the consequences quite misguiding once in a while.
And in elixir, it is just this extra dot, whereas in ruby you have blocks on top of this. Blocks are amazing and I am surprised how much you can do with just blocks, but they only work when you need just one anonymous function which is the last argument. Then since you should be able to deal with other scenarios, here comes the whole method/lambda/proc/block confusion.
Anyway... this is out of scope.
I've never understood why explanations of this are so complicated.
It's really just an exceptionally small distinction combined with the realities of Ruby-style "function execution without parens".
Compare:
def fun1(x, y) do
x + y
end
To:
fun2 = fn
x, y -> x + y
end
While both of these are just identifiers...
fun1 is an identifier that describes a named function defined with def.
fun2 is an identifier that describes a variable (that happens to contain a reference to function).
Consider what that means when you see fun1 or fun2 in some other expression? When evaluating that expression, do you call the referenced function or do you just reference a value out of memory?
There's no good way to know at compile time. Ruby has the luxury of introspecting the variable namespace to find out if a variable binding has shadowed a function at some point in time. Elixir, being compiled, can't really do this. That's what the dot-notation does, it tells Elixir that it should contain a function reference and that it should be called.
And this is really hard. Imagine that there wasn't a dot notation. Consider this code:
val = 5
if :rand.uniform < 0.5 do
val = fn -> 5 end
end
IO.puts val # Does this work?
IO.puts val.() # Or maybe this?
Given the above code, I think it's pretty clear why you have to give Elixir the hint. Imagine if every variable de-reference had to check for a function? Alternatively, imagine what heroics would be necessary to always infer that variable dereference was using a function?
There's an excellent blog post about this behavior: link
Two types of functions
If a module contains this:
fac(0) when N > 0 -> 1;
fac(N) -> N* fac(N-1).
You can’t just cut and paste this into the shell and get the same
result.
It’s because there is a bug in Erlang. Modules in Erlang are sequences
of FORMS. The Erlang shell evaluates a sequence of
EXPRESSIONS. In Erlang FORMS are not EXPRESSIONS.
double(X) -> 2*X. in an Erlang module is a FORM
Double = fun(X) -> 2*X end. in the shell is an EXPRESSION
The two are not the same. This bit of silliness has been Erlang
forever but we didn’t notice it and we learned to live with it.
Dot in calling fn
iex> f = fn(x) -> 2 * x end
#Function<erl_eval.6.17052888>
iex> f.(10)
20
In school I learned to call functions by writing f(10) not f.(10) -
this is “really” a function with a name like Shell.f(10) (it’s a
function defined in the shell) The shell part is implicit so it should
just be called f(10).
If you leave it like this expect to spend the next twenty years of
your life explaining why.
Elixir has optional braces for functions, including functions with 0 arity. Let's see an example of why it makes a separate calling syntax important:
defmodule Insanity do
def dive(), do: fn() -> 1 end
end
Insanity.dive
# #Function<0.16121902/0 in Insanity.dive/0>
Insanity.dive()
# #Function<0.16121902/0 in Insanity.dive/0>
Insanity.dive.()
# 1
Insanity.dive().()
# 1
Without making a difference between 2 types of functions, we can't say what Insanity.dive means: getting a function itself, calling it, or also calling the resulting anonymous function.
fn -> syntax is for using anonymous functions. Doing var.() is just telling elixir that I want you to take that var with a func in it and run it instead of referring to the var as something just holding that function.
Elixir has a this common pattern where instead of having logic inside of a function to see how something should execute, we pattern match different functions based on what kind of input we have. I assume this is why we refer to things by arity in the function_name/1 sense.
It's kind of weird to get used to doing shorthand function definitions (func(&1), etc), but handy when you're trying to pipe or keep your code concise.
In elixir we use def for simply define a function like we do in other languages.
fn creates an anonymous function refer to this for more clarification
Only the second kind of function seems to be a first-class object and can be passed as a parameter to other functions. A function defined in a module needs to be wrapped in a fn. There's some syntactic sugar which looks like otherfunction(myfunction(&1, &2)) in order to make that easy, but why is it necessary in the first place? Why can't we just do otherfunction(myfunction))?
You can do otherfunction(&myfunction/2)
Since elixir can execute functions without the brackets (like myfunction), using otherfunction(myfunction)) it will try to execute myfunction/0.
So, you need to use the capture operator and specify the function, including arity, since you can have different functions with the same name. Thus, &myfunction/2.

Resources