Functions vs methods - f#

I'm very new to F#. One of the first things I noticed was that collection operations are defined as functions rather than as methods.
As an experiment, I defined a couple of methods on list:
type List<'a> with
member this.map f = List.map f this
member this.filter f = List.filter f this
Then, given these helpers:
let square x = x * x
let isEven n = n % 2 = 0
here's an example of using the methods:
[1 .. 10].filter(isEven).map(square)
And here's the traditional way:
[1 .. 10] |> List.filter isEven |> List.map square
So concision clearly wasn't a reason to choose functions over methods. :-)
From a library design perspective, why were functions chosen over methods?
My guess is that it's because you can pass List.filter around, but can't really pass the filter method around unless it's "tied" to a list or wrapped in an anonymous function (i.e. (fun (ls : 'a list) -> ls.filter) effectively turns the filter method back into a function on list).
However, even with that reason, it seems the most common case of invoking an operation directly would give favor to methods since they are more concise. So I'm wondering if there's another reason.
Edit:
My second guess is function specialization. I.e. it's straightforward to specialize List.filter (e.g. let evens List.filter isEven). It seems more verbose to have to define an evens method.

What functions have over methods is function specialization and the easy factoring it enables.
Here's the example expression involving functions:
let square x = x * x
let isEven n = n % 2 = 0
[1 .. 10] |> List.filter isEven |> List.map square
Let's factor out a function called evens for filtering evens:
let evens = List.filter isEven
And now let's factor out a function which squares a list of ints:
let squarify = List.map square
Our original expression is now:
[1 .. 10] |> evens |> squarify
Now let's go back to the original method based expression:
[1 .. 10].filter(isEven).map(square)
Factoring out a filter on evens isn't as trivial in this case.

I think your guess is correct. Concision aside, being able to treat List.filter as a first class thing that may be passed around (your first guess) and partially applied (your second guess) is key. It's a verb- rather than noun-oriented way of looking at the world. I think Steve Yegge said it best :)

Related

F# pipe operator confusion

I am learning F# and the use cases of the |>, >>, and << operators confuse me. I get that everything if statements, functions, etc. act like variables but how do these work?
Usually we (community) say the Pipe Operator |> is just a way, to write the last argument of a function before the function call. For example
f x y
can be written
y |> f x
but for correctness, this is not true. It just pass the next argument to a function. So you could even write.
y |> (x |> f)
All of this, and all other kind of operators works, because in F# all functions are curried by default. This means, there exists only functions with one argument. Functions with many arguments, are implemented that a functions return another function.
You could also write
(f x) y
for example. The function f is a function that takes x as argument and returns another function. This then gets y passed as an argument.
This process is automatically done by the language. So if you write
let f x y z = x + y + z
it is the same as:
let f = fun x -> fun y -> fun z -> x + y + z
Currying is by the way the reason why parenthesis in a ML-like language are not enforced compared to a LISP like language. Otherwise you would have needded to write:
(((f 1) 2) 3)
to execute a function f with three arguments.
The pipe operator itself is just another function, it is defined as
let (|>) x f = f x
It takes a value x as its first argument. And a function f as its second argument. Because operators a written "infix" (this means between two operands) instead of "prefix" (before arguments, the normal way), this means its left argument to the operator is the first argument.
In my opinion, |> is used too much by most F# people. It makes sense to use piping if you have a chain of operations, one after another. Typically for example if you have multiple list operations.
Let's say, you want to square all numbers in a list and then filter only the even ones. Without piping you would write.
List.filter isEven (List.map square [1..10])
Here the second argument to List.filter is a list that is returned by List.map. You can also write it as
List.map square [1..10]
|> List.filter isEven
Piping is Function application, this means, you will execute/run a function, so it computes and returns a value as its result.
In the above example List.map is first executed, and the result is passed to List.filter. That's true with piping and without piping. But sometimes, you want to create another function, instead of executing/running a function. Let's say you want to create a function, from the above. The two versions you could write are
let evenSquares xs = List.filter isEven (List.map square xs)
let evenSquares xs = List.map square xs |> List.filter isEven
You could also write it as function composition.
let evenSquares = List.filter isEven << List.map square
let evenSquares = List.map square >> List.filter isEven
The << operator resembles function composition in the "normal" way, how you would write a function with parenthesis. And >> is the "backwards" compositon, how it would be written with |>.
The F# documentation writes it the other way, what is backward and forward. But i think the F# language creators are wrong.
The function composition operators are defined as:
let (<<) f g x = f (g x)
let (>>) f g x = g (f x)
As you see, the operator has technically three arguments. But remember currying. When you write f << g, then the result is another functions, that expects the last argument x. Passing less arguments then needed is also often called Partial Application.
Function composition is less often used in F#, because the compiler sometimes have problems with type inference if the function arguments are generic.
Theoretically you could write a program without ever defining a variable, just through function composition. This is also named Point-Free style.
I would not recommend it, it often makes code harder to read and/or understand. But it is sometimes used if you want to pass a function to another
Higher-Order function. This means, a functions that take another function as an argument. Like List.map, List.filter and so on.
Pipes and composition operators have simple definition but are difficult to grasp. But once we have understand them, they are super useful and we miss them when we get back to C#.
Here some explanations but you get the best feedbacks from your own experiments. Have fun!
Pipe right operator |>
val |> fn ≡ fn val
Utility:
Building a pipeline, to chain calls to functions: x |> f |> g ≡ g (f x).
Easier to read: just follow the data flow
No intermediary variables
Natural language in english: Subject Verb.
It's regular in object-oriented code : myObject.do()
In F#, the "subject" is usually the last parameter: List.map f list. Using |>, we get back the natural "Subject Verb" order: list |> List.map f
Final benefit but not the least: help type inference:
let items = ["a"; "bb"; "ccc"]
let longestKo = List.maxBy (fun x -> x.Length) items // ❌ Error FS0072
// ~~~~~~~~
let longest = items |> List.maxBy (fun x -> x.Length) // ✅ return "ccc"
Pipe left operator <|
fn <| expression ≡ fn (expression)
Less used than |>
✅ Small benefit: avoiding parentheses
❌ Major drawback: inverse of the english natural "left to right" reading order and inverse of execution order (because of left-associativity)
printf "%i" 1+2 // 💥 Error
printf "%i" (1+2) // With parentheses
printf "%i" <| 1+2 // With pipe left
What about this kind of expression: x |> fn <| y ❓
In theory, allow using fn in infix position, equivalent of fn x y
In practice, it can be very confusing for some readers not used to it.
👉 It's probably better to avoid using <|
Forward composition operator >>
Binary operator placed between 2 functions:
f >> g ≡ fun x -> g (f x) ≡ fun x -> x |> f |> g
Result of the 1st function is used as argument for the 2nd function
→ types must match: f: 'T -> 'U and g: 'U -> 'V → f >> g :'T -> 'V
let add1 x = x + 1
let times2 x = x * 2
let add1Times2 x = times2(add1 x) // 😕 Style explicit but heavy
let add1Times2' = add1 >> times2 // 👍 Style concise
Backward composition operator <<
f >> g ≡ g << f
Less used than >>, except to get terms in english order:
let even x = x % 2 = 0
// even not 😕
let odd x = x |> even |> not
// "not even" is easier to read 👍
let odd = not << even
☝ Note: << is the mathematical function composition ∘: g ∘ f ≡ fun x -> g (f x) ≡ g << f.
It's confusing in F# because it's >> that is usually called the "composition operator" ("forward" being usually omitted).
On the other hand, the symbols used for these operators are super useful to remember the order of execution of the functions: f >> g means apply f then apply g. Even if argument is implicit, we get the data flow direction:
>> : from left to right → f >> g ≡ fun x -> x |> f |> g
<< : from right to left → f << g ≡ fun x -> f <| (g <| x)
(Edited after good advices from David)

A more functional way to create tuples from two arrays

I've created a function that gets all integers from 1 to n and then combines with the same sequence to create a sequence of tuples of all combinations. So passing it the integer 2 would give you [(1,1);(1,2);(2,1);(2,2)]:
let allTuplesUntil x =
let primary = seq { 1 .. x }
let secondary = seq { 1 .. x }
[for x in primary do
for y in secondary do
yield (x,y)]
This implementations works, but it uses an inner and outer for loop, similar to what I would do in c#.
Could this be achieved in a more idiomatic functional way? Would a more functional way typically be more desirable or is this acceptable in a functional language because of its brevity and clarity?
I'm relatively new to f# and looking for some feedback.
These loops are part of what's called computation expression, which is quite idiomatic to F#. It's just made to look like familiar loops. I can't see any problem with your code being written in this way. If what you want is to get rid of the loops, you could hide them in functions:
let cartesianProduct xs ys =
xs |> Seq.collect (fun x -> ys |> Seq.map (fun y -> x, y))
cartesianProduct [1;2;3] ['a';'b';'c']
val it : seq<int * char> = seq [(1, 'a'); (1, 'b'); (1, 'c'); (2, 'a'); ...]
First, just because there is a for doesn't mean its not functional. In this example you go over each element and yield a new element that will turn into a new element of a new immutable list. Such feature is also named "List Comprehension" and part of languages like Haskell. Imperative would be to loop over a list and mutate the list.
Second, remember that other functions like map, fold, filter also just loop over each element, like a for expression. They are just less powerful than a for loop.
Third, even if it would be "not 100% functional". Who cares? Code should be easily readable and understandable. The intention of two for loops is easy to understand.
Fourth, the equivalent function of the for expression is usually the bind or in this case the Seq.collect function. You also could write, this code.
[for x in primary do
for y in secondary do
yield (x,y)]
Like this:
primary |> Seq.collect (fun x ->
secondary |> Seq.collect (fun y ->
[x,y]
))
I prefer the for loops for readability!

Why data parameter comes last

Why have the data parameter in F# to come last, like the following code snippet shows:
let startsWith lookFor (s:string) = s.StartsWith(lookFor)
let str1 =
"hello"
|> startsWith "h"
I think part of your answer is in your question. The |> (forward pipe) operator lets you specify the last parameter to a function before you call it. If the parameters were in the opposite order, then that wouldn't work. The best examples of the power of this are with chaining of functions that operate on lists. Each function takes a list as its last parameter and returns a list that can be passed to the next function.
From http://www.tryfsharp.org/Learn/getting-started#chaining-functions:
[0..100]
|> List.filter (fun x -> x % 2 = 0)
|> List.map (fun x -> x * 2)
|> List.sum
The |> operator allows you to reorder your code by specifying the last
argument of a function before you call it. This example is
functionally equivalent to the previous code, but it reads much more
cleanly. First, it creates a list of numbers. Then, it pipes that list
of numbers to filter out the odds. Next, it pipes that result to
List.map to double it. Finally, it pipes the doubled numbers to
List.sum to add them together. The Forward Pipe Operator reorganizes
the function chain so that your code reads the way you think about the
problem instead of forcing you to think inside out.
As mentioned in the comments there is also the concept of currying, but I don't think that is as easy to grasp as chaining functions.

Different argument order for getting N-th element of Array, List or Seq

Is there a good reason for a different argument order in functions getting N-th element of Array, List or Seq:
Array.get source index
List .nth source index
Seq .nth index source
I would like to use pipe operator and it seems possible only with Seq:
s |> Seq.nth n
Is there a way to have the same notation with Array or List?
I don't think of any good reason to define Array.get and List.nth this way. Given that pipeplining is very common in F#, they should have been defined so that the source argument came last.
In case of List.nth, it doesn't change much because you can use Seq.nth and time complexity is still O(n) where n is length of the list:
[1..100] |> Seq.nth 10
It's not a good idea to use Seq.nth on arrays because you lose random access. To keep O(1) running time of Array.get, you can define:
[<RequireQualifiedAccess>]
module Array =
/// Get n-th element of an array in O(1) running time
let inline nth index source = Array.get source index
In general, different argument order can be alleviated by using flip function:
let inline flip f x y = f y x
You can use it directly on the functions above:
[1..100] |> flip List.nth 10
[|1..100|] |> flip Array.get 10
    
Just use backward pipe operator:
[1..1000] |> List.nth <| 42
Since both operators are left associative, x |> f <| y is parsed as (x |> f) <| y, and this does the trick.
Backward pipe operator is also useful if you want to remove parentheses: f (very long expression) can be replaced with f <| very long expression.
Since Pad and bytebuster answered your last question I will focus on the why part.
This is based my current knowledge and not historical facts.
Since F# derived from OCaml and OCaml has Array and List but not Seq and F# uses |> for natural pipelining and type checking and OCaml lacks the pipleline operator, the authors of F# made the switch for Seq. But obviously to be backward compatablie with OCaml they did not switch everything.

f# iterating over two arrays, using function from a c# library

I have a list of words and a list of associated part of speech tags. I want to iterate over both, simultaneously (matched index) using each indexed tuple as input to a .NET function. Is this the best way (it works, but doesn't feel natural to me):
let taggingModel = SeqLabeler.loadModel(lthPath +
"models\penn_00_18_split_dict.model");
let lemmatizer = new Lemmatizer(lthPath + "v_n_a.txt")
let input = "the rain in spain falls on the plain"
let words = Preprocessor.tokenizeSentence( input )
let tags = SeqLabeler.tagSentence( taggingModel, words )
let lemmas = Array.map2 (fun x y -> lemmatizer.lookup(x,y)) words tags
Your code looks quite good to me - most of it deals with some loading and initialization, so there isn't much you could do to simplify that part. Alternatively to Array.map2, you could use Seq.zip combined with Seq.map - the zip function combines two sequences into a single one that contains pairs of elements with matching indices:
let lemmas = Seq.zip words tags
|> Seq.map (fun (x, y) -> lemmatizer.lookup (x, y))
Since lookup function takes a tuple that you got as an argument, you could write:
// standard syntax using the pipelining operator
let lemmas = Seq.zip words tags |> Seq.map lemmatizer.lookup
// .. an alternative syntax doing exactly the same thing
let lemmas = (words, tags) ||> Seq.zip |> Seq.map lemmatizer.lookup
The ||> operator used in the second version takes a tuple containing two values and passes them to the function on the right side as two arguments, meaning that (a, b) ||> f means f a b. The |> operator takes only a single value on the left, so (a, b) |> f would mean f (a, b) (which would work if the function f expected tuple instead of two, space separated, parameters).
If you need lemmas to be an array at the end, you'll need to add Array.ofSeq to the end of the processing pipeline (all Seq functions work with sequences, which correspond to IEnumerable<T>)
One more alternative is to use sequence expressions (you can use [| .. |] to construct an array directly if that's what you need):
let lemmas = [| for wt in Seq.zip words tags do // wt is tuple (string * string)
yield lemmatizer.lookup wt |]
Whether to use sequence expressions or not - that's just a personal preference. The first option seems to be more succinct in this case, but sequence expressions may be more readable for people less familiar with things like partial function application (in the shorter version using Seq.map)

Resources