We can make a nested list in Erlang by writing something like this:
NL = [[2,3], [1]].
[[2,3],[1]]
But assume we wrote it like this instead:
OL = [[2,3]|1].
[[2,3]|1]
Is OL still a list? Can someone please elaborate on what OL is?
This is called an improper list and should typically not be used. I think most library functions expect proper lists (e.g. length([1|2]) raises a badarg error). Pattern matching with improper lists works, though.
For some use cases, see Practical use of improper lists in Erlang (perhaps all functional languages)
More information about | and building lists is given in Functional Programming: what is an "improper list"?
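To make the distinction concrete, here is a minimal sketch in OCaml that models Erlang's cons cells. The type term and the helpers is_proper_list and list_of_ints are hypothetical names invented for this illustration; they are not part of any Erlang or OCaml library. The point is only that a cons cell's tail may be any term, and a list is "proper" when following the tails ends in the empty list.

```ocaml
(* A toy model of Erlang terms: integers, the empty list, and cons cells.
   All names here are assumptions made for this illustration. *)
type term =
  | Int of int
  | Nil                   (* [] *)
  | Cons of term * term   (* [Head | Tail]; Tail may be ANY term *)

(* A list is proper when the chain of tails ends in Nil. *)
let rec is_proper_list = function
  | Nil -> true
  | Cons (_, tail) -> is_proper_list tail
  | Int _ -> false

(* Build a proper list of integers. *)
let rec list_of_ints = function
  | [] -> Nil
  | x :: xs -> Cons (Int x, list_of_ints xs)

(* NL = [[2,3], [1]]: the outer tail chain ends in []. *)
let nl = Cons (list_of_ints [2; 3], Cons (list_of_ints [1], Nil))

(* OL = [[2,3]|1]: the tail of the only cons cell is the integer 1. *)
let ol = Cons (list_of_ints [2; 3], Int 1)
```

In this model, is_proper_list nl holds and is_proper_list ol does not, which mirrors why functions such as length reject OL.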
On page 57 of the book "Programming Erlang" by Joe Armstrong (2007) 'lists:map/2' is mentioned in the following way:
Virtually all the modules that I write use functions like
lists:map/2 —this is so common that I almost consider map
to be part of the Erlang language. Calling functions such
as map and filter and partition in the module lists is extremely
common.
The usage of the word 'almost' got me confused about what the difference between Erlang as a whole and the Erlang language might be, and if there even is a difference at all. Is my confusion based on semantics of the word 'language'? It seems to me as if a standard module floats around the borders of what does and does not belong to the actual language it's implemented in. What are the differences between a programming language at its core and the standard libraries implemented in it?
I'm aware of the fact that this is quite the newbie question, but in my experience jumping to my own conclusions can lead to bad things. I was hoping someone could clarify this somewhat.
Consider this simple program:
1> List = [1, 2, 3, 4, 5].
[1,2,3,4,5]
2> Fun = fun(X) -> X*2 end.
#Fun<erl_eval.6.50752066>
3> lists:map(Fun, List).
[2,4,6,8,10]
4> [Fun(X) || X <- List].
[2,4,6,8,10]
Both produce the same output. However, the first one, lists:map/2, is a library function, while the second one is a core language construct called a list comprehension. The first is implemented in Erlang itself (as a plain recursive function in the lists module); the second is parsed by the Erlang compiler. The library function can be optimized only as much as the compiler is able to optimize its implementation in Erlang. The list comprehension, however, could in principle be optimized much further, even down to hand-written assembler in the BEAM VM that is called from the resulting beam file for maximum performance.
Some language constructs look like they are part of the language, whereas in fact they are implemented in the library, for example spawn/3. When it's used in code it looks like a keyword, but in Erlang it's not one of the reserved words. Because of that, the Erlang compiler will automatically add the erlang module in front of it and call erlang:spawn/3, which is a library function. Those functions are called BIFs (Built-In Functions).
In general, what belongs to the language itself is what that language's compiler can parse and translate to executable code (in other words, what's defined by the language's grammar). Everything else is a library. Libraries are usually written in the language for which they are designed, but that doesn't have to be the case, e.g. some of Erlang's library functions are written in C as Erlang NIFs.
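The same split can be sketched in OCaml (used here only for illustration; the thread is about Erlang). The function my_map below is a hypothetical name: it shows that a map function needs nothing beyond the core language (functions, pattern matching, the :: constructor), which is what makes it a library function rather than part of the grammar.

```ocaml
(* A user-written map: ordinary code built from core language constructs.
   my_map is a made-up name for this sketch. *)
let rec my_map f = function
  | [] -> []
  | x :: rest ->
    let y = f x in          (* apply f to the head first *)
    y :: my_map f rest

let doubled = my_map (fun x -> x * 2) [1; 2; 3; 4; 5]

(* The standard library's List.map is the same kind of thing:
   a function written in the language, not special syntax. *)
let doubled_stdlib = List.map (fun x -> x * 2) [1; 2; 3; 4; 5]
```

Both values are the list [2; 4; 6; 8; 10]; the compiler treats my_map and List.map alike, because neither is known to the grammar.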
I've been watching an interesting video in which type classes in Haskell are used to solve the so-called "expression problem". About 15 minutes in, it shows how type classes can be used to "open up" a datatype based on a discriminated union for extension -- additional discriminators can be added separately without modifying / rebuilding the original definition.
I know type classes aren't available in F#, but is there a way using other language features to achieve this kind of extensibility? If not, how close can we come to solving the expression problem in F#?
Clarification: I'm assuming the problem is defined as described in the previous video
in the series -- extensibility of the datatype and operations on the datatype with the features of code-level modularization and separate compilation (extensions can be deployed as separate modules without needing to modify or recompile the original code) as well as static type safety.
As Jörg pointed out in a comment, it depends on what you mean by solve. If you mean solve including some form of type-checking that you're not missing an implementation of some function for some case, then F# doesn't give you any elegant way (and I'm not sure if the Haskell solution is elegant). You may be able to encode it using the SML solution mentioned by kvb or maybe using one of the OO based solutions.
In reality, if I was developing a real-world system that needs to solve the problem, I would choose a solution that doesn't give you full checking, but is much easier to use.
A sketch would be to use obj as the representation of a type and use reflection to locate functions that provide implementation for individual cases. I would probably mark all parts using some attribute to make checking easier. A module adding application to an expression might look like this:
[<Extends("Expr")>] // Specifies that this type should be treated as a case of 'Expr'
type App = App of obj * obj
module AppModule =
    [<Implements("format")>] // Specifies that this extends function 'format'
    let format (App(e1, e2)) =
        // We don't make recursive calls directly, but instead use `invoke` function
        // and some representation of the function named `formatFunc`. Alternatively
        // you could support 'e1?format' using dynamic invoke.
        sprintf "(%s %s)" (invoke formatFunc e1) (invoke formatFunc e2)
This does not give you any type-checking, but it gives you a fairly elegant solution that is easy to use and not that difficult to implement (using reflection). Checking that you're not missing a case is not done at compile-time, but you can easily write unit tests for that.
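A related trade-off can be sketched in OCaml rather than F#, using extensible variant types (available since OCaml 4.02) in place of obj and reflection. This is a different technique from the attribute-based sketch above, shown only to illustrate the same idea: cases and handlers are registered separately, and a missing case is detected at run time, not at compile time. The names expr, register and format are assumptions made for this example.

```ocaml
(* An open type: new cases can be added later, possibly in other modules. *)
type expr = ..

(* Registry of formatters. Each one returns None for cases it does not know. *)
let formatters : (expr -> string option) list ref = ref []

let register f = formatters := f :: !formatters

(* Dispatch: try each registered formatter until one handles the case. *)
let format (e : expr) : string =
  let rec try_all = function
    | [] -> failwith "format: unhandled case"   (* found only at run time *)
    | f :: rest ->
      (match f e with
       | Some s -> s
       | None -> try_all rest)
  in
  try_all !formatters

(* The "original" module: numbers and addition. *)
type expr += Num of int | Add of expr * expr

let () =
  register (function
    | Num n -> Some (string_of_int n)
    | Add (a, b) -> Some (Printf.sprintf "(%s + %s)" (format a) (format b))
    | _ -> None)

(* A later extension adding application, without touching the code above. *)
type expr += App of expr * expr

let () =
  register (function
    | App (f, x) -> Some (Printf.sprintf "(%s %s)" (format f) (format x))
    | _ -> None)
```

As with the reflection approach, nothing checks statically that every case has a formatter, so a unit test that formats one value of each case is the practical safeguard.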
See Vesa Karvonen's comment here for one SML solution (albeit cumbersome), which can easily be translated to F#.
I know type classes aren't available in F#, but is there a way using other language features to achieve this kind of extensibility?
I do not believe so, no.
If not, how close can we come to solving the expression problem in F#?
The expression problem is about allowing the user to augment your library code with both new functions and new types without having to recompile your library. In F#, union types make it easy to add new functions (but impossible to add new union cases to an existing union type) and class types make it easy to derive new class types (but impossible to add new methods to an existing class hierarchy). These are the two forms of extensibility required in practice. The ability to extend in both directions simultaneously without sacrificing static type safety is just an academic curiosity, IME.
Incidentally, the most elegant way to provide this kind of extensibility that I have seen is to sacrifice type safety and use so-called "rule-based programming". Mathematica does this. For example, a function to compute the symbolic derivative of an expression that is an integer literal, variable or addition may be written in Mathematica like this:
D[_Integer, _] := 0
D[x_Symbol, x_] := 1
D[_Symbol, _] := 0
D[f_ + g_, x_] := D[f, x] + D[g, x]
We can retrofit support for multiplication like this:
D[f_ g_, x_] := f D[g, x] + g D[f, x]
and we can add a new function to evaluate an expression like this:
E[n_Integer] := n
E[f_ + g_] := E[f] + E[g]
To me, this is far more elegant than any of the solutions written in languages like OCaml, Haskell and Scala but, of course, it is not type safe.
I use F# a lot. All the basic collections in F# implement the IEnumerable interface, so it is quite natural to access them using the single Seq module in F#. Is this possible in OCaml?
The other question is that 'a seq in F# is lazy, e.g. I can create a sequence from 1 to 100 using {1..100} or more verbosely:
seq { for i=1 to 100 do yield i }
In OCaml, I find myself using one of the following two methods to work around the lack of this feature:
generate a list:
let rec range a b =
  if a > b then []
  else a :: range (a+1) b;;
or resort to explicit recursive functions.
The first generates extra lists. The second breaks the abstraction as I need to operate on the sequence level using higher order functions such as map and fold.
I know that the OCaml library has Stream module. But the functionality of it seems to be quite limited, not as general as 'a seq in F#.
BTW, I have been working through Project Euler problems in OCaml recently, so there are quite a few sequence operations that in an imperative language would be loops with a complex body.
This OCaml library seems to offer what you are asking for. I've not used it though.
http://batteries.forge.ocamlcore.org/
Check out this module, Enum
http://batteries.forge.ocamlcore.org/doc.preview:batteries-beta1/html/api/Enum.html
I somehow feel Enum is a much better name than Seq. It eliminates the lowercase/uppercase confusion on Seqs.
An enumerator, seen from a functional programming angle, is exactly a fold function. Where a class would implement an Enumerable interface in an object-oriented data structures library, a type comes with a fold function in a functional data structure library.
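A small sketch of that idea in OCaml: generic code takes the container's fold function as a parameter, so the same code works on any structure that provides one. The names sum_with and to_list_with are made up for this illustration; List.fold_left and Array.fold_left are the standard library functions.

```ocaml
(* Generic consumers: they only need a left fold for the container type. *)
let sum_with fold container =
  fold (fun acc x -> acc + x) 0 container

let to_list_with fold container =
  (* Accumulate in reverse, then restore the original order. *)
  List.rev (fold (fun acc x -> x :: acc) [] container)

(* The same code works over lists and arrays. *)
let total_list = sum_with List.fold_left [1; 2; 3]
let total_array = sum_with Array.fold_left [| 1; 2; 3 |]
let from_array = to_list_with Array.fold_left [| "a"; "b" |]
```

Here the fold plays the role that the IEnumerable interface plays in F#: it is the one thing a container has to supply.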
Stream is a slightly quirky imperative lazy list (imperative in that reading an element is destructive). CamlP5 comes with a functional lazy list library, Fstream. This already cited thread offers some alternatives.
It seems you are looking for something like Lazy lists.
Check out this SO question
The Batteries library mentioned also provides the (--) operator:
# 1 -- 100;;
- : int BatEnum.t = <abstr>
The enum is not evaluated until you traverse the items, so it provides a feature similar to your second request. Just beware that Batteries' enums are mutable. Ideally, there would also be a lazy list implementation that all data structures could be converted to/from, but that work has not been done yet.
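For comparison, here is a sketch of a lazy range using the Seq module that later joined the OCaml standard library. It assumes OCaml 4.07 or newer, which postdates this thread, so it is an alternative to the Batteries enums discussed above, not something the answers had available. The helpers range, from and take are written out for this example.

```ocaml
(* A lazy range: elements are produced only when the sequence is forced. *)
let rec range a b : int Seq.t = fun () ->
  if a > b then Seq.Nil else Seq.Cons (a, range (a + 1) b)

(* An infinite sequence n, n+1, n+2, ... *)
let rec from n : int Seq.t = fun () -> Seq.Cons (n, from (n + 1))

(* Take the first n elements as a list (hand-written helper). *)
let rec take n (s : 'a Seq.t) : 'a list =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest

(* Higher-order pipeline with no intermediate lists. *)
let sum_even_squares =
  range 1 100
  |> Seq.filter (fun x -> x mod 2 = 0)
  |> Seq.map (fun x -> x * x)
  |> Seq.fold_left (+) 0

(* Laziness lets us map over an infinite sequence and take a prefix. *)
let first_squares = take 3 (Seq.map (fun x -> x * x) (from 1))
```

Unlike Batteries' enums, these sequences are not consumed destructively: forcing the same Seq.t value twice yields the same elements.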
In Erlang, you are encouraged not to match patterns that you do not actually handle. For example:
case (AnInt rem 10) of
    1 -> {ok, 10};
    9 -> {ok, 25}
end
is a style that is encouraged, with any other result raising a case_clause error because no clause matches. This is consistent with the "let it crash" philosophy in Erlang.
On the other hand, F# would issue an "incomplete pattern matches" warning for the equivalent F# code, like here.
The question: why wouldn't F# remove the warning, effectively by augmenting every pattern matching with a statement equivalent to
|_ -> failwith "badmatch"
and use the "let it crash" philosophy?
Edit: Two interesting answers so far: either to avoid bugs that are likely when not handling all cases of an algebraic datatype, or because of the .NET platform. One way to find out which it is would be to check OCaml. So, what is the default behaviour in OCaml?
Edit: To remove misunderstanding by .Net people who have no background in Erlang. The point of the Erlang philosophy is not to produce bad code that always crashes. Let it crash means let some other process fix the error. Instead of writing the function so that it can handle all possible cases, let the caller (for example) handle the bad cases which are thrown automatically. For those with Java background, it is like the difference between having a language with checked exceptions which must declare everything it will possibly return with every possible exception, and having a language in which functions may raise exceptions that are not explicitly declared.
F# (and other languages with pattern matching, like Haskell and OCaml) does implicitly add a case that throws an exception.
In my opinion the most valuable reason for having complete pattern matches and paying attention to the warning, is that it makes it easy to refactor by extending your datatype, because the compiler will then warn you about code you haven't yet updated with the new case.
On the other hand, sometimes there genuinely are cases that should be left out, and then it's annoying to have to put in a catch-all case with what is often a poor error message. So it's a trade-off.
In answer to your edit, this is also a warning by default in OCaml (and in Haskell with -Wall).
In most cases, particularly with algebraic datatypes, forgetting a case is likely to be an accident and not an intentional decision to ignore a case. In strongly typed functional languages, I think that most functions will be total, and should therefore handle every case. Even for partial functions, it's often ideal to throw a specific exception rather than to use a generic pattern matching failure (e.g. List.head throws an ArgumentException when given an empty list).
Thus, I think that it generally makes sense for the compiler to warn the developer. If you don't like this behavior, you can either add a catch-all pattern which itself throws an exception, or turn off or ignore the warning.
why wouldn't F# remove the warning
Interesting that you would ask this. Silently injecting sources of run-time error is absolutely against the philosophy behind F# and its relatives. It is considered to be a grotesque abomination. This family of languages are all about static checking, to the extent that the type system was fundamentally designed to facilitate exactly these kinds of static checks.
This stark difference in philosophy is precisely why F# and Python are so rarely compared and contrasted. "Never the twain shall meet" as they say.
So, what is the default behaviour in OCaml?
Same as F#: exhaustiveness and redundancy of pattern matches are checked at compile time and a warning is issued if a match is found to be suspect. Idiomatic style is also the same: you are expected to write your code such that these warnings do not appear.
This behaviour has nothing to do with .NET and, in fact, this functionality (from OCaml) was only implemented properly in F# quite recently.
For example, if you use a pattern in a let binding to extract the first element of a list because you know the list will always have at least one element:
let x::_ = myList
In this family of languages, that is almost always indicative of a design flaw. The correct solution is to represent your non-empty list using a type that makes it impossible to represent the empty list. Static type checking then proves that your list cannot be empty and, therefore, guarantees that this source of run-time errors has been completely eliminated from your code.
For example, you can represent a non-empty list as a tuple containing the head and the tail list. Your pattern match then becomes:
let x, _ = myList
This is exhaustive so the compiler is happy and does not issue a warning. This code cannot go wrong at run-time.
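The head-and-tail representation can be sketched in OCaml as follows. The type name nonempty and the helper functions are assumptions made for this illustration, not an existing library.

```ocaml
(* A non-empty list: a guaranteed first element plus an ordinary tail. *)
type 'a nonempty = 'a * 'a list

(* Total function: the pattern is exhaustive, so there is no warning
   and no possible match failure. *)
let head ((x, _) : 'a nonempty) : 'a = x

(* The only place where emptiness must be handled is at the boundary. *)
let of_list = function
  | [] -> None
  | x :: xs -> Some (x, xs)

let to_list ((x, xs) : 'a nonempty) : 'a list = x :: xs

(* maximum needs no "empty list" error case at all. *)
let maximum ((x, xs) : int nonempty) : int = List.fold_left max x xs
```

The possibility of an empty input is dealt with once, in of_list, and every function after that point is total.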
I became an advocate of this technique back in 2004, when I refactored about 1kLOC of commercial OCaml code that had been a major source of run-time errors in an application (even though they were explicit in the form of catch-all match cases that raised exceptions). My refactoring removed all of the sources of run-time errors from most of the code. The reliability of the entire application improved enormously. Moreover, we had wasted weeks hunting bugs via debugging but my refactoring was completed within 2 days. So this technique really does pay dividends in the real world.
Erlang cannot have exhaustive pattern matching because of its dynamic types, unless you put a catch-all in every match, which is just silly. OCaml, on the other hand, can. OCaml also tries to push all issues that can be caught at compile time to compile time.
OCaml by default does warn you about incomplete matches. You can disable it by adding "p" to the "-w" flag. The idea with these warnings is that more often than not (at least in my experience) they are an indication of programmer error. Especially when all your patterns are really complex like Node (Node (Leaf 4) x) (Node y (Node Empty _)), it is easy to miss a case. When the programmer is sure that it cannot go wrong, explicitly adding a | _ -> assert false case is an unambiguous way to indicate that.
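A minimal sketch of that last idiom, using a made-up function parity. The compiler cannot know that a remainder modulo 2 of a non-negative number is always 0 or 1, so the final case documents that the programmer considers it unreachable.

```ocaml
(* Without the last case, the compiler warns that the match on int
   is not exhaustive. *)
let parity n =
  match abs n mod 2 with
  | 0 -> "even"
  | 1 -> "odd"
  | _ -> assert false   (* unreachable by the programmer's reasoning *)
```

If that reasoning ever turns out to be wrong, assert false raises Assert_failure with the source location, which is more useful than a generic match failure.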
GHC turns these warnings off by default, but you can enable them with -fwarn-incomplete-patterns