Elixir Enum vs Erlang Lists - erlang

Is there a reason to use the Enum library over the Erlang :lists module when writing code in Elixir? lists has many of the same functions, like takewhile, partition, any, all...
Also, elixir-lang.org states:
"Elixir provides excellent interoperability with Erlang libraries. In fact, Elixir discourages simply wrapping Erlang libraries in favor of directly interfacing with Erlang code."
Is it because the condition function needs to be formatted differently that Enum is used over :lists?

So, Enum is a module that holds functions that work with any data structure implementing the Enumerable protocol.
Lists in Erlang/Elixir are linked lists, which means that indexing or inserting in the middle costs O(n), while prepending to the head is O(1).
Enumerables in Elixir are data structures such as lists, but also maps and MapSets.
However, as stated in the documentation:
the majority of the functions in Enum enumerate the whole enumerable and return a list as result
That being said, two reasons to use Enum functions over lists-based functions are basically:
There are more of them, and they give you very useful ways to tackle a problem.
They are polymorphic. Swap your map for a keyword list at any time and don't waste time rewriting everything in your codebase.
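For instance, a small sketch of the polymorphism point in an iex session (outputs shown as the shell would print them): the same Enum.map/2 call works on a list, a map, or a range, while :lists.map/2 only accepts lists.
iex> Enum.map([1, 2, 3], fn x -> x * 2 end)
[2, 4, 6]
iex> Enum.map(%{a: 1, b: 2}, fn {k, v} -> {k, v * 2} end)
[a: 2, b: 4]
iex> Enum.map(1..3, fn x -> x * 2 end)
[2, 4, 6]
iex> :lists.map(fn x -> x * 2 end, [1, 2, 3])
[2, 4, 6]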

Related

Is repr(C) a preprocessor directive?

I've seen some Rust codebases use the #[repr(C)] macro (is that what it's called?), however, I couldn't find much information about it other than that it sets the type's layout in memory to the same layout as C's.
Here's what I would like to know: is this a preprocessor directive restricted to the compiler rather than part of the language itself (even though there aren't any other compiler front-ends for Rust), and why does Rust even have a memory layout different from that of C's? (It's just that I've never had to do this in another language.)
Here's a nice situation to demonstrate what I meant: if someone creates another compiler for Rust, are they required to implement this macro, or is it a compiler specific thing?
#[repr(C)] is not a preprocessor directive, since Rust doesn't use a preprocessor [1]. It is an attribute. Rust doesn't have a complete specification, but the repr attribute is mentioned in the Rust reference, so it is absolutely a part of the language. Implementation-wise, attributes are parsed the same way all other Rust code is, and are stored in the same AST. Rust has no "attribute pass": attributes are an actual part of the language. If someone else were to implement a Rust compiler, they would need to implement #[repr(C)].
Furthermore, #[repr(C)] can't be implemented without some compiler magic. In the absence of a #[repr(...)], Rust compilers are free to arrange the fields of a struct/enum however they want to (and they do take advantage of this for optimization purposes!).
Rust does have a good reason for using its own memory layout. If compilers aren't tied to how a struct is written in the source code, they can do optimizations like not storing struct fields that are never read from, reordering fields for better performance, enum tag pooling [2], and using the spare bits of NonZero* fields in the struct to store data (the last one isn't happening yet, but might in the future). But the main reason is that Rust has things that just don't make sense in C. For instance, Rust has zero-sized types (like () and [i8; 0]), trait vtables, enums with fields, and generic types, none of which exist in C and all of which cause problems when translating them to a C layout.
[1] Okay, you could use the C preprocessor with Rust if you really wanted to. Please don't.
[2] For example, enum Food { Apple, Pizza(Topping) } with enum Topping { Pineapple, Mushroom, Garlic } can be stored in just one byte, since there are only four possible Food values that can be created.
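To illustrate the layout freedom, a small sketch (the struct names are made up; the sizes shown are typical for x86_64, but the default layout is deliberately unspecified):
struct Flexible {
    a: u8,
    b: u32,
    c: u8,
}

#[repr(C)]
struct CLike {
    a: u8,
    b: u32,
    c: u8,
}

fn main() {
    // The compiler may reorder Flexible's fields, typically packing it into 8 bytes,
    // while the repr(C) version must keep declaration order and pads out to 12 bytes.
    println!("{}", std::mem::size_of::<Flexible>()); // usually 8
    println!("{}", std::mem::size_of::<CLike>());    // 12
}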
What is this?
It is not a macro, it is an attribute.
The book has a good chapter on what macros are and it mentions that there are "Attribute-like macros":
The term macro refers to a family of features in Rust: declarative macros with macro_rules! and three kinds of procedural macros:
Custom #[derive] macros that specify code added with the derive attribute used on structs and enums
Attribute-like macros that define custom attributes usable on any item
Function-like macros that look like function calls but operate on the tokens specified as their argument
Attribute-like macros are what you could use like attributes. For example:
#[route(GET, "/")]
fn index() {}
It does look like the repr attribute, doesn't it? 😃
So what is an attribute then?
Luckily Rust has great resources like rust-by-example which includes:
An attribute is metadata applied to some module, crate or item. This metadata can be used to/for:
conditional compilation of code
set crate name, version and type (binary or library)
disable lints (warnings)
enable compiler features (macros, glob imports, etc.)
link to a foreign library
mark functions as unit tests
mark functions that will be part of a benchmark
The Rust reference is also something you usually look at when you need to know something more in depth (see the chapter on attributes).
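For example, a few items from that list in action (a minimal sketch):
#![allow(dead_code)]            // disable a lint for the whole crate

#[cfg(target_os = "linux")]     // conditional compilation of code
fn linux_only() {}

#[test]                         // mark a function as a unit test
fn it_adds() {
    assert_eq!(2 + 2, 4);
}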
To the compiler authors out there:
If you were to write a Rust compiler and wanted to support things like the standard library or other crates, then you would absolutely need to implement these, because those libraries use and rely on them.
Otherwise I guess you could come up with a subset of Rust that your compiler supports. But then most people wouldn't use it.
Why does Rust not just use the C layout?
The nomicon explains why Rust needs to be able to reorder the fields of structs, for example to save space and be more efficient. It is related to, among other things, generics and monomorphization. With repr(C), struct fields must be laid out in the same order as in the definition.
The C representation is designed for dual purposes. One purpose is creating types that are interoperable with the C language. The second purpose is creating types on which you can soundly perform operations that rely on data layout, such as reinterpreting values as a different type.
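As a sketch of the interoperability purpose (the C function and struct here are hypothetical, only meant to show why a guaranteed field order matters at the FFI boundary):
#[repr(C)]
struct Point {
    x: i32,
    y: i32,
}

extern "C" {
    // Assumed to be provided by some linked C library; with #[repr(C)],
    // the Rust Point matches a C struct { int32_t x; int32_t y; } exactly.
    fn point_length(p: Point) -> f64;
}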

Is a 'standard module' part of the 'programming language'?

On page 57 of the book "Programming Erlang" by Joe Armstrong (2007) 'lists:map/2' is mentioned in the following way:
Virtually all the modules that I write use functions like lists:map/2 — this is so common that I almost consider map to be part of the Erlang language. Calling functions such as map and filter and partition in the module lists is extremely common.
The usage of the word 'almost' got me confused about what the difference between Erlang as a whole and the Erlang language might be, and whether there even is a difference at all. Is my confusion based on the semantics of the word 'language'? It seems to me as if a standard module floats around the border of what does and does not belong to the actual language it's implemented in. What are the differences between a programming language at its core and the standard libraries implemented in it?
I'm aware that this is quite the newbie question, but in my experience jumping to my own conclusions can lead to bad things. I was hoping someone could clarify this somewhat.
Consider this simple program:
1> List = [1, 2, 3, 4, 5].
[1,2,3,4,5]
2> Fun = fun(X) -> X*2 end.
#Fun<erl_eval.6.50752066>
3> lists:map(Fun, List).
[2,4,6,8,10]
4> [Fun(X) || X <- List].
[2,4,6,8,10]
Both produce the same output; however, the first one, lists:map/2, is a library function, and the second one is a core language construct, called a list comprehension. The first one is implemented in Erlang (incidentally, also using a list comprehension); the second one is parsed directly by the compiler. The library function can be optimized only as much as the compiler is able to optimize its implementation in Erlang. The list comprehension, however, may be optimized as far as being written in assembler in the BEAM VM and called from the resulting .beam file for maximum performance.
Some language constructs look like they are part of the language, whereas in fact they are implemented in the library, for example spawn/3. When it's used in the code it looks like a keyword, but in Erlang it's not one of the reserved words. Because of that, the Erlang compiler will automatically add the erlang module in front of it and call erlang:spawn/3, which is a library function. Such functions are called BIFs (Built-In Functions).
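For instance, continuing the shell session above, these two calls end up in the same library function; the bare spawn/3 is simply resolved to erlang:spawn/3 by the compiler (the pids and exact output interleaving are illustrative):
5> spawn(io, format, ["hello from a new process~n"]).
<0.86.0>
hello from a new process
6> erlang:spawn(io, format, ["hello from a new process~n"]).
<0.88.0>
hello from a new process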
In general, what belongs to the language itself is what that language's compiler can parse and translate to executable code (or, in other words, what's defined by the language's grammar). Everything else is a library. Libraries are usually written in the language for which they are designed, but that doesn't necessarily have to be the case, e.g. some Erlang library functions are written in C as Erlang NIFs.

Linked list single class vs multiple classes

In my second term as a computer science student we focused almost the whole term on writing linked lists in different variations (stack, queue, ...). The design of these lists always came down to this:
class List<T> {
    class ListElement {
        T value;
        ListElement next;
    }

    ListElement root;
}
with variations to which methods were implemented and how they worked (I have left out constructors and properties for simplicity here).
Some day I started learning Scala and focusing on functional programming. This also came to the point where a linked list was written, but in a different style of implementation:
class List[T](head: T, tail: List[T])
Despite the different syntax and immutability, this is in my opinion a different approach.
And I thought to myself: "Well, you could have implemented lists the same way in C# or Java with one class fewer than the approach you learned."
I can see why you would implement a linked list like that in a functional language, where recursion is not as dangerous as in C# or Java, because, at least to my way of thinking, a recursive implementation of all the usual methods on a linked list is very intuitive for this design.
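For concreteness, here is a rough sketch of what I mean (names are made up, written loosely in Scala 3 style); the usual operations fall out of a simple head/tail recursion:
sealed trait MyList[+T]
case object Empty extends MyList[Nothing]
case class Cons[T](head: T, tail: MyList[T]) extends MyList[T]

// map written the "intuitive" recursive way
def map[A, B](xs: MyList[A])(f: A => B): MyList[B] = xs match {
  case Empty      => Empty
  case Cons(h, t) => Cons(f(h), map(t)(f))
}

// map(Cons(1, Cons(2, Empty)))(_ * 2) == Cons(2, Cons(4, Empty))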
What I do not understand is why linked lists in C# or Java are typically implemented in the first fashion, when you could implement them the other way with less code but equal clarity. (I am not talking about the implementation of lists in the language's libraries, but about the lists you typically write as a programmer-to-be.)
The only benefit I can see with the first approach is that you can hide the implementation from the user a bit better, but is this the reason, and is it worth the additional class?
I wouldn't even need to expose my implementation to the user, as I could still implement my list differently internally, and maybe only choose to have a constructor like that and provide functionality to retrieve the first element of the list as head and the rest as tail.
The reasons for them to be "implemented in the first fashion" as you mentioned include
Performance.
Time and space complexity are the two most important concerns when writing algorithms or implementing data structures that support operations like search and sort. As you have mentioned, lists created the recursive way aren't mutable! The very purpose of creating such a list is fast operations on it, so designers prefer the 'first fashion'.
Object orientation
While solving real-world problems, the initial object-oriented analysis and design (OOAD) matters a lot. With an object model that resembles real-world objects as closely as possible, designers can achieve better solutions. The recursive approach seems to miss out on this aspect.
Scalability
Designers of APIs/libraries keep scalability in mind when they draft their designs. Code written in the 'first fashion' is more scalable and easier to comprehend.
Other design concerns
This is not an exhaustive list of the reasons in any way. There are many other factors, and lessons learned from experience in programming folklore, that lead to the choice of the first fashion.

What are the postfix numbers on F# core methods?

I was looking at the source code for the Append function in the SeqModule and noticed that there are a ton of duplicate methods with #xxx postfixed to them. Does anyone know why these are here?
In short, those are the concrete classes that back various local function values, and the #xxx values indicate the source code line number that caused them to be generated (though this is an implementation detail, and the classes could be given any arbitrary name).
Likewise, the C# compiler uses a conceptually similar scheme when defining classes to implement anonymous delegates, iterator state machines, etc. (see Eric Lippert's answer here for how the "magic names" in C# work).
These schemes are necessary because not every language feature maps perfectly to things that can be expressed cleanly in the CLR.

IEnumerable<T> in OCaml

I use F# a lot. All the basic collections in F# implement the IEnumerable interface, thus it is quite natural to access them using the single Seq module in F#. Is this possible in OCaml?
The other question is that 'a seq in F# is lazy, e.g. I can create a sequence from 1 to 100 using {1..100} or, more verbosely:
seq { for i=1 to 100 do yield i }
In OCaml, I find myself using the following two methods to work around the lack of this feature:
generate a list:
let rec range a b =
  if a > b then []
  else a :: range (a+1) b;;
or resort to explicit recursive functions.
The first generates extra lists. The second breaks the abstraction, as I want to operate at the sequence level using higher-order functions such as map and fold.
I know that the OCaml library has Stream module. But the functionality of it seems to be quite limited, not as general as 'a seq in F#.
BTW, I have been playing with Project Euler problems in OCaml recently, so there are quite a few sequence operations that in an imperative language would be loops with a complex body.
This OCaml library seems to offer what you are asking. I've not used it, though.
http://batteries.forge.ocamlcore.org/
Check out this module, Enum:
http://batteries.forge.ocamlcore.org/doc.preview:batteries-beta1/html/api/Enum.html
I somehow feel Enum is a much better name than Seq. It eliminates the lowercase/uppercase confusion on Seqs.
An enumerator, seen from a functional programming angle, is exactly a fold function. Where a class would implement an Enumerable interface in an object-oriented data structures library, a type comes with a fold function in a functional data structure library.
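As a sketch of that idea (the function names here are made up), a range can be exposed purely through a fold, without ever materializing a list:
let rec fold_range f acc a b =
  if a > b then acc
  else fold_range f (f acc a) (a + 1) b

(* analogous to folding over an F# seq: sum of 1..100 *)
let sum = fold_range (+) 0 1 100   (* 5050 *)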
Stream is a slightly quirky imperative lazy list (imperative in that reading an element is destructive). Camlp5 comes with a functional lazy list library, Fstream. The already-cited thread offers some alternatives.
It seems you are looking for something like Lazy lists.
Check out this SO question
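For reference, a minimal hand-rolled lazy list might look like this (a sketch, not a library type):
type 'a lazy_list = Nil | Cons of 'a * 'a lazy_list Lazy.t

(* a lazy range: the tail is only computed when forced *)
let rec range_lazy a b =
  if a > b then Nil
  else Cons (a, lazy (range_lazy (a + 1) b))

let rec map f = function
  | Nil -> Nil
  | Cons (x, rest) -> Cons (f x, lazy (map f (Lazy.force rest)))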
The Batteries library mentioned also provides the (--) operator:
# 1 -- 100;;
- : int BatEnum.t = <abstr>
The enum is not evaluated until you traverse the items, so it provides a feature similar to your second request. Just beware that Batteries' enums are mutable. Ideally, there would also be a lazy list implementation that all data structures could be converted to/from, but that work has not been done yet.
